(54) Video coding device and video decoding device



(57) A video coding device comprising: first coding
means for coding a video sequence of a background;
second coding means for coding a video sequence of at
least a part of a front image; and area-information coding
means for coding binary area information representing
a shape of a part video, characterized in that the
device is further provided with a weight data preparing
means for preparing multivalued weighting data from
the binary area-information and gives weight to each of
the video sequences according to the weight data.

A corresponding decoding device is also provided. 



[FIG. 15: block diagram of a coding device and a decoding device. Labelled blocks of the decoding device
(reference numerals 106 to 110): first video-decoding portion, second video-decoding portion, area-information
decoding portion, alpha-plane generating portion, first weighting portion and second weighting portion;
output: synthesized video-sequence.]




EP0 961 496 A2 



Description

BACKGROUND OF THE INVENTION

[0001] The present invention pertains to the field of art for digital video processing and relates particularly to a video
coding device for encoding video data at a high efficiency and a video decoding device for decoding coded data
prepared by said video coding device at a high efficiency.

[0002] There has been proposed a video coding method which is capable of encoding a specified area to be of a
higher image quality than that of other areas.
[0003] A video coding method described in reference ISO/IEC JTC1/SC29/WG11 MPEG95/030 selects a specified area
(hereinafter referred to as selected area) and encodes it to have a higher image quality by controlling
quantizer step sizes and time resolution.

[0004] Another conventional method shows an area-selecting portion intended to select a specified area of a video
image. In case of selecting, e.g., a face area of a video image on a display of a video telephone, it is possible to select
an area by using a method that is described in a reference material "Real-time auto face-tracking system" (The Institute
of Image Electron[...]).

[0005] An area-position-and-shape coding portion encodes a position and a shape of a selected area. An optional
shape may be encoded by using, e.g., chain codes. The coded position and shape are assembled into coded data and
transferred or accumulated by a coded-data integrating portion.
[0006] A coded-parameter adjusting portion adjusts a variety of parameters usable for controlling image quality or
data amount in video coding so that the area-position-and-shape coding portion may encode a selected area to get
a higher image quality [...]

[0007] A parameter coding portion encodes a variety of adjusted parameters. The coded parameters are assembled
into coded data and transferred or accumulated by a coded-data integrating portion. The video coding portion encodes
input video data by using a variety of the parameters by a combination of conventional coding methods such as motion
compensative prediction, orthogonal transformation, quantization and variable length coding. The coded video data is
assembled into coded data by the coded-data integrating portion, then the coded data is transferred or accumulated.
[0008] The selected [...]

[0009] The conventional art, however, includes such problems that it cannot obtain a specified area image by decoding
a part of decoded data and/or obtains a decoded area image having a relatively low quality because of a selected area
[...] being included in the same group of coded data. Recently, many studies have been made on hierarchi[...]
coded [...] but have [...]

[0010] There has been studied a video coding method which is adapted to synthesis of different kinds of video
sequences.

[0011] A paper "Image coding using hierarchical representation and multiple templates" appeared in Technical Report
of IEICE (Institute of Electronics, Information and Communication Engineers) IE94-159, pp. 99 - 106, 1995, describes
such an image synthesizing method that combines a video-sequence being a background video and a part-video-
sequence being a foreground video (e.g., a figure image or a fish image cut-out by using the chromakey technique) to
produce a new sequence.

[0012] In a conventional method, a first video-sequence is assumed to be a background video and a second video-
sequence is assumed to be a part video. An alpha plane is weight data used when synthesizing a part image with a
background image in a moving picture (video) sequence. There has been proposed an exemplified image made of
pixels weighted with values of 1 to 0. The alpha-plane data is assumed to be 1 within a part and 0 out of a part. The alpha
data may have a value of 0 to 1 in a boundary portion between a part and the outside thereof in order to indicate a mixed
state of pixel values [...]

[0013] In the conventional method, a first video-coding portion encodes the first video-sequence and a second video-
coding portion encodes the second video-sequence according to an international standard video-coding system, e.g.,
MPEG or H.261. An alpha-plane coding portion encodes an alpha-plane. In the above-mentioned paper, this portion
uses the techniques of [...] and Haar transformation. A coded-data integrating portion [...]
integrates coded data received from the coding portions and accumulates or transmits the integrated coded data.
[0014] In the decoding device of the conventional method, a coded-data disassembling portion (not shown) disassembles
coded data into the coded data of the first video-sequence, the coded data of the second video-sequence and the
coded data of the alpha-plane, which are then decoded respectively by a first video-decoding portion, a second video-
decoding portion and an alpha-plane decoding portion. Two decoded sequences are synthesized according to
weighted mean values by a first weighting portion, a second weighting portion and an adder. The first video-sequence and
the second video-sequence are combined according to the following equation:
f(x,y,t) = [1 - a(x,y,t)]f1(x,y,t) + a(x,y,t)f2(x,y,t)

In the equation, (x,y) represents coordinate data of an intraframe pixel position, t denotes a frame time, f1(x,y,t)
represents a pixel value of the first video sequence, f2(x,y,t) represents a pixel value of the second video sequence, f(x,y,t)
represents a pixel value of the synthesized video sequence and a(x,y,t) represents alpha-plane data. Namely, the first
weighting portion uses 1 - a(x,y,t) as a weight while the second weighting portion uses a(x,y,t) as a weight. As mentioned
above, the conventional method produces a large amount of coded data because it must encode alpha-plane data.
[0015] To avoid this problem, saving the information amount by binarizing alpha-plane data may be considered, but it
is accompanied by such a visual defect that a tooth-like line appears at the boundary between a part image and a
background as the result of discontinuous change of pixel values thereabout.
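
The weighted synthesis f = (1 - a)f1 + a f2 and the effect of binarizing the alpha plane can be illustrated with a minimal Python sketch. The function name blend, the one-row test images and the 0.5 binarization threshold are assumptions chosen for illustration only; they do not appear in the specification.

```python
def blend(f1, f2, alpha):
    """Per-pixel weighted mean f = (1 - a) * f1 + a * f2 over 2-D lists."""
    return [
        [(1.0 - a) * p1 + a * p2 for p1, p2, a in zip(r1, r2, ra)]
        for r1, r2, ra in zip(f1, f2, alpha)
    ]


# Hypothetical one-row images: a dark background and a bright part video.
background = [[10.0, 10.0, 10.0, 10.0]]
part = [[200.0, 200.0, 200.0, 200.0]]

# Multivalued alpha plane: intermediate values at the boundary of the part.
soft_alpha = [[0.0, 0.25, 0.75, 1.0]]
# Binarized alpha plane (threshold 0.5 is an assumption of this sketch).
hard_alpha = [[1.0 if a >= 0.5 else 0.0 for a in row] for row in soft_alpha]

soft = blend(background, part, soft_alpha)  # gradual transition
hard = blend(background, part, hard_alpha)  # abrupt step at the boundary
```

With the multivalued plane the pixel values change gradually across the boundary, whereas the binarized plane yields a single abrupt step, which corresponds to the discontinuity described in [0015].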

[0016] There has been studied a video coding method that is adapted to synthesize different kinds of video
sequences.

[0017] A paper "Image coding using hierarchical representation and multiple templates" appeared in Technical Report
of IEICE IE94-159, pp. 99 - 106, 1995, describes such an image synthesizing method that combines a video-sequence
being a background video and a part-video-sequence being a foreground video (e.g., a figure image or a fish image
cut-out by using the chromakey technique) to produce a new sequence.

[0018] A paper "Temporal Scalability based on image content" (ISO/IEC JTC1/SC29/WG11 MPEG95/211, (1995))
describes a technique for preparing a new video-sequence by synthesizing a part-video sequence of a high frame rate
with a video-sequence of a low frame rate. This system is to encode a lower-layer frame at a low frame-rate by a
prediction coding method and to encode only a selected area of an upper-layer frame at a high frame rate by prediction
coding. The upper layer does not encode a frame coded at the lower-layer and uses a copy of the decoded image of the
lower-layer. The selected area may be considered to be a remarkable part of image, e.g., a human figure.
[0019] In a conventional method, at the coding side, an input video-sequence is thinned by a first thinning portion and
a second thinning portion and the thinned video-sequence with a reduced frame rate is then transferred to an upper-
layer coding portion and a lower-layer coding portion respectively. The upper-layer coding portion has a frame rate
higher than that of the lower-layer coding portion.

[0020] The lower-layer coding portion encodes a whole image of each frame in the received video-sequence by using
an international standard video-coding method such as MPEG, H.261 and so on. The lower-layer coding portion also
prepares decoded frames which are used for prediction coding and, at the same time, are inputted into a synthesizing
portion.

[0021] In a code-amount control portion of a conventional coding portion, a coding portion encodes video frames by
using a method or a combination of methods such as motion compensative prediction, orthogonal transformation,
quantization, variable length coding and so on. A quantization-width (step-size) determining portion determines a
quantization width (step size) to be used in a coding portion. A coded-data amount determining portion calculates an
accumulated amount of generated coded data. Generally, the quantization width is increased or decreased to prevent
increase or decrease of coded data amount.
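
The feedback described in [0021] can be sketched as follows. This is a minimal illustration, not the control law of any particular coder: the function name, the unit step of adjustment and the clamping range 1 to 31 (the quantizer range used in H.261) are assumptions of this sketch.

```python
def next_quantization_width(q, accumulated_bits, target_bits, q_min=1, q_max=31):
    """Return the quantization width for the next coding unit.

    The width is raised when more coded data than targeted has accumulated
    (coarser quantization, fewer bits) and lowered when less has accumulated.
    The +/-1 adjustment and the [q_min, q_max] range are illustrative choices.
    """
    if accumulated_bits > target_bits:
        q += 1
    elif accumulated_bits < target_bits:
        q -= 1
    return max(q_min, min(q_max, q))


# Hypothetical usage: the coder has overshot its budget, so the width grows.
q_after_overshoot = next_quantization_width(10, accumulated_bits=1200, target_bits=1000)
q_after_undershoot = next_quantization_width(10, accumulated_bits=800, target_bits=1000)
```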

[0022] The upper-layer coding portion encodes only a selected part of each frame in a received video-sequence on
the basis of area information by using an international standard video-coding method such as MPEG, H.261 and so
on. However, frames encoded at the lower-layer coding portion are not encoded by the upper-layer coding portion. The
area information is information indicating a selected area of, e.g., an image of a human figure in each video frame,
which is a binarized image taking 1 in the selected area and 0 outside the selected area. The upper-layer coding portion
also prepares decoded selected areas of each frame, which are transferred to the synthesizing portion.
[0023] An area-information coding portion encodes area information by using 8-directional quantizing codes. The
8-directional quantizing code is a numeric code indicating a direction to a proceeding point and it is usually used for
representing digital graphics.
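
A minimal Python sketch of an 8-directional code of the kind described in [0023] follows. The numbering of the directions (0 = +x, counted counter-clockwise with y pointing up) and the function names are assumptions of this sketch; the specification does not fix a numbering.

```python
# Direction index -> (dx, dy). The numbering convention is an assumption.
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]


def chain_encode(points):
    """Encode a contour as (start point, list of direction codes 0..7).

    Consecutive points must be 8-neighbours; otherwise list.index raises
    ValueError, which signals an invalid contour in this sketch.
    """
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        codes.append(DIRECTIONS.index((x1 - x0, y1 - y0)))
    return points[0], codes


def chain_decode(start, codes):
    """Rebuild the contour points from the start point and the codes."""
    pts = [start]
    for c in codes:
        dx, dy = DIRECTIONS[c]
        x, y = pts[-1]
        pts.append((x + dx, y + dy))
    return pts


# Hypothetical closed contour of a 2x2 square region.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1), (0, 0)]
start, codes = chain_encode(square)
```

Each contour step costs one symbol out of eight, so the amount of coded area information grows with the length and complexity of the contour.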

[0024] A synthesizing portion outputs a decoded lower-layer video-frame which has been encoded by the lower-layer
coding portion and is to be synthesized. When a frame to be synthesized has not been encoded at the lower-layer
coding portion, the synthesizing portion outputs a decoded video-frame that is generated by using two decoded frames,
which have been encoded at the lower-layer and stand before and after the lacking lower-layer frame, and one decoded
upper-layer frame to be synthesized. The two lower-layer frames stand before and after the upper-layer frame. The
synthesized video-frame is inputted into the upper-layer coding portion to be used therein for predictive coding. The image
processing in the synthesizing portion is as follows:

[0025] An interpolating image is first prepared for two lower-layer frames. A decoded image of the lower-layer at time
t is expressed as B(x,y,t), where x and y are co-ordinates defining the position of a pixel in a space. When the two
decoded images of the lower-layer are located at time t1 and t2 and the decoded image of the upper-layer is located at
t3 (t1<t3<t2), the interpolating image I(x,y,t3) of time t3 is calculated according to the following equation (1):

I(x,y,t3) = [(t2-t3)B(x,y,t1) + (t3-t1)B(x,y,t2)]/(t2-t1) (1)
The decoded image E of the upper layer is then synthesized with the obtained interpolating image I by using
synthesizing weight information W(x,y,t) prepared from area information. A synthesized image S is defined according to the
following equation:

S(x,y,t) = [1 - W(x,y,t)]I(x,y,t) + W(x,y,t)E(x,y,t) (2)

The area information M(x,y,t) is a binarized image taking 1 in a selected area and 0 outside the selected area. The
weight information W(x,y,t) is obtained by passing the area information through a low-pass filter several times.
Namely, the weight information W(x,y,t) takes 1 within a selected area, 0 outside the selected area and
a value of 0 to 1 at the boundary of the selected area.
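
The conventional synthesis of [0025] (temporal interpolation of two lower-layer frames, weight information obtained by repeated low-pass filtering of binary area information, and a weighted mean with the upper-layer image) can be sketched in Python as below. The 3x3 mean filter with edge replication, the number of passes and all function names are assumptions of this sketch; the specification only states that a low-pass filter is applied several times.

```python
def interpolate(b1, b2, t1, t2, t3):
    """Linear interpolation of two lower-layer frames at time t3 (t1 < t3 < t2)."""
    w1 = (t2 - t3) / (t2 - t1)
    w2 = (t3 - t1) / (t2 - t1)
    return [[w1 * p + w2 * q for p, q in zip(r1, r2)] for r1, r2 in zip(b1, b2)]


def box_filter(img):
    """One pass of a 3x3 mean filter with edge replication (assumed low-pass filter)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            row.append(total / 9.0)
        out.append(row)
    return out


def weight_from_area(mask, passes=2):
    """Turn binary area information into multivalued weights by repeated filtering."""
    w = [[float(v) for v in row] for row in mask]
    for _ in range(passes):
        w = box_filter(w)
    return w


def synthesize(i_img, e_img, w):
    """Weighted mean S = (1 - W) * I + W * E."""
    return [
        [(1.0 - ww) * i + ww * e for i, e, ww in zip(ri, re, rw)]
        for ri, re, rw in zip(i_img, e_img, w)
    ]
```

Every filter pass visits each pixel and its neighbours again, so the cost of preparing the weights grows with the number of passes.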

[0026] The coded data prepared by the lower-layer coding portion, the upper-layer coding portion and the area
information coding portion is integrated by an integrating portion (not shown) and then is transmitted or accumulated.
[0027] In the decoding side of the conventional system, a coded data disassembling portion (not shown) separates
coded data into [...] coded data. These coded data
are decoded respectively by a lower-layer decoding portion, [...] and an area information
decoding portion.
[0028] A synthesizing [...]
an image by using a decoded [...] and a decoded upper-layer image according to the same method as
described at the [...] same time, is
inputted into [...]

[0029] The above-described decoding device decodes both the lower-layer and the upper-layer frames, but a decoding
device consisting of a lower-layer decoding portion is also applied, omitting the upper-layer coding portion and the
synthesizing portion. [...] decoding device can reproduce a part of coded data.
[0030] Problems to be solved by the present invention:

(1) As mentioned [...] an output image from two lower-layer decoded images and
an upper-layer d[...] preparing an [interpolating] image of two lower-layer frames and,
therefore, [...] image may be [...]

[...] images A and C are two decoded lower-layer frames and an image B is a decoded upper-layer frame. The
images are displayed in the time order A, B and C. Because the selected area [...],
[...] from the images A and C shows two selected areas overlapped with each other. The image [...]
synthesized with the [...] by using weight information. An output image has [...]
overlapped with each other. Two selected areas of the lower-layer image appear like [...]
selected area image of the upper-layer, thereby considerably deteriorating the quality of an [...]
lower-layer frame [...]
sequence may be displayed with distortion that considerably [...]

(2) The conventional art uses 8-directional quantizing codes for encoding area information. In case of encoding
the area-information [at a] low bit-rate or of a complicated-shape area, an amount of coded area-information
increases and takes a large portion of the total amount of coded data, that may cause the deterioration of the image
quality.

(3) The conventional art obtains weight information by making the area information pass through a low-pass filter
several times. This increases an amount of processing operations.

(4) The conventional [art] uses a predictive coding method. However, the predictive coding of the lower-layer frames may
cause a large distortion if a [...]
propagate a [...]
(5) According to the conventional art, each lower-layer frame is encoded by using an international standard video-
coding method (e.g., [...]). On
the contrary, in each upper-layer frame, only a selected area is encoded to be of a h[...]
of the selected area image may vary with time. This is sensed as a flicker-like dist[...] a problem.

SUMMARY OF THE INVENTION 

[0031] Accordingly, an object of the present invention is to provide coding and decoding devices which are capable
of encoding a selectively specified area of a video image to be of a relatively high image-quality in a whole coded video
data system and which are also capable of giving a hierarchical structure to the coded data to make it possible to reproduce
the specified area of the coded video image to be of a variety of image-quality and/or any other area to be of a
relatively low image-quality.

[0032] With the thus constructed coding and decoding devices, a selected area of an image can be encoded and
decoded to be of a higher image-quality than that of other areas by differentiating values of parameters such as spatial
resolution, quantizer step sizes and time resolution. The coding device can make coded data have respective
hierarchical orders and, therefore, the decoding device can easily decode a part of coded data.

[0033] Another object of the present invention is to provide a coding device and a decoding device, which are capable 
of generating a synthesized image from a reduced amount of coded data without deterioration of the synthesized image 
quality. 

[0034] With the coding and decoding devices according to the present invention, the decoding device can prepare
weight information for synthesizing a plurality of video-sequences by using weighted means, eliminating the necessity
of encoding weight information by the coding device.

[0035] The coded data are weighted, which may save the total amount of data to be produced.

[0036] The reverse weighting, which is performed by the decoding side, may generate weight-removed decoded
data.

[0037] Another object of the present invention is to provide a coding device and a decoding device, which are free
from the above-mentioned problems (described (1) to (5) as problems to be solved in prior art) and are capable of
encoding video-frames with a reduced amount of coded data without deterioration of the image quality.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0038] 

Fig. 1 is a block diagram for explaining a prior art.

Fig. 2 is a view for explaining a concept of coding method according to the present invention. 

Fig. 3 shows an example of a concept of decoding method according to the present invention. 

Fig. 4 shows another example of a concept of decoding method according to the present invention. 

Fig. 5 is a block diagram showing a coding device representing an embodiment of the present invention.

Fig. 6 shows an exemplified method of encoding lower-layer, first upper-layer and second upper-layer coded data
by a coding device according to the present invention.

Fig. 7 shows another exemplified method of encoding first upper-layer coded data by a coding device according to
the present invention.

Fig. 8 shows another exemplified method of encoding lower-layer, first upper-layer and second upper-layer coded
data by a coding device according to the present invention.

Fig. 9 shows another exemplified method of encoding lower-layer, first upper-layer and second upper-layer coded
data by a coding device according to the present invention.

Fig. 10 is a block diagram showing a decoding device representing an embodiment of the present invention.

Fig. 11 is a block diagram showing a decoding device representing another embodiment of the present invention.

Fig. 12 is a block diagram showing a decoding device representing another embodiment of the present invention.

Fig. 13 is a block diagram for explaining a conventional method.

Fig. 14 shows an example of an alpha plane according to a conventional method.

Fig. 15 is a block diagram for explaining an embodiment of the present invention.

Fig. 16 shows an example of an area information according to the present invention. 

Fig. 17 shows an example of a linear weight function according to the present invention. 

Fig. 18 shows an example of preparing an alpha-plane according to the present invention. 

Fig. 19 is a block diagram for explaining another embodiment of the present invention. 

Fig. 20 is a block diagram for explaining an example of a video-coding portion in another embodiment of the present 
invention. 

Fig. 21 is a block diagram for explaining an example of a video-decoding portion in another embodiment of the 
present invention. 

Fig. 22 is a block diagram for explaining another example of a video-coding portion in another embodiment of the 
present invention. 

Fig. 23 is a block diagram for explaining another example of a video-decoding portion in another embodiment of the 
present invention. 

Fig. 24 is a block diagram for explaining another example of a video-coding portion in another embodiment of the
present invention.

Fig. 25 is a block diagram for explaining another example of a video-decoding portion in another embodiment of the
present invention.

Fig. 26 is a block diagram for explaining an exemplified case that an area information is not encoded in another
embodiment of the present invention.

Fig. 27 shows [...]

Fig. 29 is [...] conventional method for controlling the number of codes.

[...]

Fig. 34 is a block diagram for explaining another embodiment of the present invention.

Fig. 35 is a block diagram for explaining a coding side of another embodiment of the present invention.

Fig. [...] is a block diagram for explaining a decoding side of another embodiment of the present invention.

[...]

Fig. [...] is a [...] of codes usable [...] area by a code-amount
control [...]

Fig. [...] is a view for explaining a target coefficient of codes usable for coding an area outside a selected area by a
code-amount [...]

Fig. 44 is a [...]
code-amount [...] method according to the present invention.

PREFERRED EMBODIMENT OF THE INVENTION

[0039] Fig. 1 is a block diagram showing a prior art as a reference to the present invention. An area-selecting portion
20 is intended to select a specified area of a video image. [...] image on a
[...] "Real-time [...]
[...]

[0040] An area-position-and-shape coding portion 21 encodes a position and a shape of a selected area.
An optional shape may be encoded by using, e.g., chain codes. The coded position and shape are assembled into
coded data and transferred or accumulated by a coded-data integrating portion [...]

[0041] A coded-parameter adjusting portion [...] adjusts a variety of parameters usable for controlling image quality or
data amount in video coding so that the area-position-and-shape [...]
get a higher [...]

[0042] A parameter coding portion 24 encodes a variety of adjusted parameters. The coded parameters are
assembled into coded data and transferred or accumulated by a coded-data integrating portion 22. The video coding portion
25 encodes input video data by using a variety of the parameters by a combination of conventional coding methods
such as motion compensative prediction, orthogonal transformation, quantization and variable length coding. The
coded video data is assembled into coded data by the coded-data integrating portion [...], then the coded data is
transferred or accumulated. [...]

[0043] Fig. 2 is a view for explaining [...] The hierarchical
encoding [...]
At the lower-layer, a selected area (hatched area) is encoded with a relatively low image-quality. A remarkable time is
denoted by t and a decoded image of the time t is denoted by L(t). At the first upper layer, a whole image is encoded to
be of a relatively low image-quality. A decoded image of this layer is denoted by H1(t). In this case, predictive coding is
made by using the decoded image of the lower-layer L(t) and the decoded image of the first upper-layer H1(t-1). At the
second upper layer, only the selected area is predictively encoded to be of a higher image-quality [than that at the lower-]
layer. The decoded image of this layer is denoted by H2(t). In this case, predictive coding is made by using the decoded
image of the lower-layer L(t) and the decoded image of the second upper-layer H2(t-1).
[0044] Figs. 3 and 4 are illustrative of a [concept of decoding method according to the present invention]. Fig. 3 shows
three-[...] decoding [...]
all layers data. In this case, decoding the lower-layer data reproduces only an image selected by [the coding device] to
be of a relatively low image-quality, decoding the first upper-layer data reproduces a whole image of a relatively low
image-quality and decoding all coded data reproduces the selected area of a higher image-quality and all other areas
of a lower image-quality. On the other hand, Fig. 4 shows a case when all coded signals are decoded after decoding
the second upper-layer data instead of the first upper-layer data. In this case, an intermediate layer (the second upper-
layer) data is decoded to reproduce a selected image-area only of a higher image-quality.

[0045] With the decoding device according to the present invention, only a selected image area of a lower image-
quality is reproduced from coded data corresponding to the lower-layer while a whole image of a lower image-quality or
only a selected area of a higher image-quality is reproduced from coded data corresponding to the upper layer. Namely,
any one of two upper layers above a common lower-layer can be selected.
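
The choice of layers described in [0044] and [0045] can be summarized by a small Python sketch. The layer names "lower", "upper1" and "upper2", the tuple layout and the function name are hypothetical labels introduced only for this illustration.

```python
def reproduced_image(layers):
    """Describe what a decoder reproduces from a given set of decoded layers.

    Returns (reproduced region, quality of selected area, quality of other areas).
    Both upper layers are predicted from the lower layer, so the lower layer is
    required in this sketch.
    """
    layers = frozenset(layers)
    if "lower" not in layers:
        raise ValueError("the lower layer is needed to decode either upper layer")
    if layers == {"lower"}:
        return ("selected area only", "lower", None)
    if layers == {"lower", "upper1"}:
        return ("whole image", "lower", "lower")
    if layers == {"lower", "upper2"}:
        return ("selected area only", "higher", None)
    if layers == {"lower", "upper1", "upper2"}:
        return ("whole image", "higher", "lower")
    raise ValueError("unknown layer name")


# Hypothetical usage: a decoder that skips the first upper layer.
result = reproduced_image(["lower", "upper2"])
```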
[0046] An embodiment of the present invention will be described as follows: 
[0047] Fig. 5 is a block diagram showing a coding device embodying the present invention. 

[0048] In Fig. 5, an area selecting portion 5 and an area-position-and-shape coding portion 6 are similar in function
to those of the prior art shown in Fig. 1.
[0049] In Fig. 5, a lower-layer coding portion 4 encodes only an area selected by the area selecting portion 5 to be
of a lower image-quality, prepares coded data of the lower-layer and generates a decoded image from the coded data.
The decoded image is used as a reference image for predictive coding.

[0050] A first-layer coding portion 1 encodes a whole image to be of a lower image-quality, prepares coded data of
the first-layer and generates a decoded image from said coded data. The decoded image is used as a reference image
for predictive coding.

[0051] A second-layer coding portion 2 encodes only a selected area image to be of a higher image-quality, prepares
coded data of the second-layer and generates a decoded image from said coded data. The decoded image is used as
a reference image for predictive coding.

[0052] A coded data integrating portion 3 integrates selected-area position-and-shape codes, lower-layer coded data,
first upper-layer coded data and second upper-layer coded data.

[0053] There are several kinds of encoding methods which are applicable in the lower-layer coding portion 4, the first-
layer coding portion 1 and the second-layer coding portion 2, which will be described as follows: Figs. 6 and 7 are
illustrative of the technique of controlling the lower-layer image-quality and the upper-layer image quality depending upon
quantization steps.

[0054] Fig. 6(a) illustrates how to encode the lower-layer image data. A hatched area represents a selected area. At
the lower-layer, a selected area of a first frame is intraframely encoded and selected areas of other remaining frames
are predictively encoded by motion-compensative prediction method. As a reference image for the motion
compensative prediction is used a selected area of a frame of the lower-layer, which has been already encoded and decoded.
Although only forward prediction is shown in Fig. 6(a), it may be applied in combination with backward prediction.
Because the quantization step at the lower-layer is controlled to be larger than that at the second upper layer, only a
selected area of an input image is encoded to be of a lower image-quality (with a low signal-to-noise ratio).
Consequently, the lower-layer image-data is [...]
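
The relation between quantization step and signal-to-noise ratio stated in [0054] can be checked with a minimal Python sketch. It uses a plain uniform scalar quantizer on hypothetical sample values; the step sizes 16 and 4 and the function names are assumptions chosen for illustration and are not taken from the specification.

```python
import math


def quantize(samples, step):
    """Uniform scalar quantization followed by reconstruction."""
    return [step * round(s / step) for s in samples]


def snr_db(original, decoded):
    """Signal-to-noise ratio in dB between original and reconstructed samples."""
    signal = sum(s * s for s in original)
    noise = sum((s - d) ** 2 for s, d in zip(original, decoded))
    return float("inf") if noise == 0 else 10.0 * math.log10(signal / noise)


# Hypothetical sample values of a selected area.
samples = [3, 18, 41, 67, 90, 122, 157, 201]

coarse = quantize(samples, 16)  # larger step, as at the lower layer
fine = quantize(samples, 4)     # smaller step, as at the second upper layer

snr_coarse = snr_db(samples, coarse)
snr_fine = snr_db(samples, fine)
```

The larger step gives the lower signal-to-noise ratio, which is the sense in which the lower layer is "of a lower image-quality" than the second upper layer.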

[0055] Fig. 6(b) illustrates how to encode the first upper-layer image data. At this layer, a whole image is encoded.
For example, a whole image is encoded by predictive coding based on a decoded image of the lower-layer and a
decoded image of the first upper-layer. In this case, a whole image of the first frame is encoded by prediction from the
lower-layer decoded image (areas other than selected one are intraframely encoded in practice because the motion-
compensative prediction method can not be applied in practice). Other frames can be encoded by using the predictive
coding in combination with the motion compensative prediction.

[0056] Such a variation is also applicable, which does not encode a selected area and encodes only other areas by
the predictive coding method as shown in Fig. 7. The encoding process is performed for areas other than the selected
one.

[0057] Fig. 6(c) illustrates how to encode the second upper-layer image data. Only a selected image area is encoded
at a relatively small quantization step. In this case, objective data to be encoded is differential data obtained between
original image data and image data predicted from the lower-layer image data. Although only prediction from the lower-
layer image data is shown in Fig. 6(c), it may be used in combination with the prediction from a decoded frame of the
second upper-layer.

[0058] Fig. 8 is a view for explaining a method of controlling the lower-layer image quality and the upper-layer image
quality by using differential time resolution values.

[0059] Fig. 8(a) illustrates how to encode the lower-layer image data. A hatched area represents a selected area. At
the lower-layer, a selected area of a first frame is intraframely encoded and selected areas of other remaining frames
are predictively encoded by motion-compensative prediction. As a reference image for the motion compensative
prediction is used a selected area of a frame of the lower-layer, which has been already encoded and decoded. Although
only forward prediction is shown in Fig. 8(a), it may be applied in combination with backward prediction. The frame-rate
of the lower-layer is so decreased that time resolution is adjusted to be lower than that at the second upper layer. It is
also possible to encode frames at a smaller quantization interval so that each frame may have a larger signal-to-noise
ratio.

[0060] Fig. 8(b) illustrates how to encode the first upper-layer image data. A whole image is encoded with a low time-
image-resolution. In this case, it is possible to apply the coding method similar to that shown in Fig. 6(b) or Fig. 7.
[0061] Fig. 8(c) illustrates how to encode the second upper-layer image data. Only a selected area is encoded with a
higher time resolution. In this case, a frame whose selected area was encoded at the lower-layer is encoded by
prediction from the lower-layer decoded image, whereas all other frames are encoded by motion compensative prediction
from the already decoded frames of the upper-layer. In case of using prediction from the lower-layer decoded frame, it
is possible not to encode any second upper-layer image data, adopting the lower-layer decoded image as a second
upper-layer decoded image.

[0062] Fig. 9 is a view for explaining a method of controlling the lower-layer image quality and the upper-layer image
quality by using differential spatial resolution values.

[0063] Fig. 9(a) illustrates how to encode the lower-layer image data. An original image is converted into an image of
a lower spatial resolution through a low-pass filter or thinning operation. Only hatched selected areas are encoded. At
the lower-layer, a selected area of a first frame is intraframely encoded and selected areas of other remaining frames
are predictively [...]

[0064] Fig. 9(b) illustrates how to encode the first upper-layer image data. An original image is converted into an
image of a lower spatial resolution and a whole image is encoded with a higher time-resolution. In this case, it is
possible to apply the coding method similar to that shown in Fig. 6(b) or Fig. 7.

[0065] Fig. 9(c) illustrates how to encode the second upper-layer image data. Only a selected area is encoded with a
higher spatial resolution. In this case, a decoded image of the lower-layer is converted into an image having the same
spatial resolution as an original image and selected areas are encoded by prediction from the lower-layer decoded
image and by motion compensative prediction from the already decoded frame of the second upper-layer.
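
The spatial-resolution scheme of [0063] to [0065] can be sketched on a one-dimensional row of samples. This is a simplified illustration under several assumptions that are not in the specification: thinning without a low-pass pre-filter, up-conversion by sample repetition, no quantization of the residual and no motion-compensative prediction. All names and sample values are hypothetical.

```python
def downsample(row, factor=2):
    """Thinning: keep every `factor`-th sample (no pre-filter in this sketch)."""
    return row[::factor]


def upsample(row, factor=2):
    """Convert back to the original resolution by sample repetition (assumed)."""
    return [v for v in row for _ in range(factor)]


# Hypothetical original row and binary area information for it.
original = [10, 12, 20, 22, 30, 36, 40, 40]
selected = [False, False, True, True, True, True, False, False]

lower = downsample(original)    # lower layer: reduced spatial resolution
prediction = upsample(lower)    # lower-layer image at the original resolution

# Second upper layer: residual against the prediction, inside the selected area only.
residual = [o - p if s else 0 for o, p, s in zip(original, prediction, selected)]

# Decoder side: prediction plus residual restores full detail in the selected area.
reconstructed = [p + r for p, r in zip(prediction, residual)]
```

Inside the selected area the reconstruction matches the original samples, while outside it only the up-converted lower-resolution values are available.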

[0066] The image-quality controlling methods using [...]
time resolution and spatial resolution may be also applied in combination with one another.
[0067] For example, it is possible to adjust the lower-layer image-quality and the upper-layer image quality by using
a combination of [...] a differential time resolution or by using a combination of a
differential quantization step and a differential time resolution.
[0068] A selected area in a whole image is thus encoded to have a higher image quality than that of other areas. At the same time, the coded data is given a respective one of three hierarchical layers (two upper layers and one lower-layer).
[0069] Decoding devices [...]
[0070] Fig. 10 is illustrative of a first embodiment of a decoding device, which is intended to decode lower-layer images only.

[0071] In Fig. 10, a coded data separating portion 7 is intended to separate coded data into area-position-and-shape coded data and lower-layer coded image data and selectively extract desired coded data.
[0072] An area-position-and-shape decoding portion 9 is intended to decode a position code and a shape code of a selected area.

[0073] A lower-layer decoding portion 8 is intended to decode lower-layer coded data of a selected area and to prepare a lower-quality decoded image of the selected area only.

[0074] Accordingly, each image outputted from this decoding device relates to image information of a selected area only, which is indicated as a window on a display screen. The lower-layer decoding portion 8 may be provided with a spatial resolution converter to enlarge the screen size and indicate it on the display screen.

[0075] The shown embodiment may obtain decoded images of a lower quality because it decodes only lower-layer data of a selected area, but it may be simple in hardware construction, omitting an upper-layer decoding portion, and may easily decode the coded image by processing a decreased amount of coded data.

[0076] Fig. 11 is illustrative of a second embodiment of a decoding device according to the present invention, wherein an area-position-and-shape decoding portion 9 and a lower-layer decoding portion 8 are similar in function to those of the first embodiment.

[0077] In Fig. 11, a coded data separating portion 10 separately extracts, from coded data, area-position-and-shape coded data, lower-layer coded data of an area and first upper-layer coded data.

[0078] A first upper-layer decoding portion 11 decodes the first upper-layer coded data, whereby a whole image is decoded to be of a lower quality by using area-position-and-shape data, the lower-layer decoded image and the first upper-layer decoded image. A first upper-layer decoded image is thus prepared.

[0079] Although the shown embodiment uses the first upper-layer coded data, it may also use the second upper-layer instead of the first upper-layer. In this case, the coded data separating portion 10 separately extracts, from coded data, area-position-and-shape coded data, lower-layer coded data of an area and second upper-layer coded data. The first upper-layer decoding portion 11 is replaced by a second upper-layer decoding portion which decodes the second upper-layer coded data by using the area-position-and-shape data, the lower-layer decoded image and the second upper-layer decoded image, and only the selected image is decoded to be of a higher quality. A second upper-layer decoded image thus prepared may be displayed as a window on a display screen or be enlarged to full-screen size and then displayed thereon.
[0080] Fig. 12 is illustrative of a third embodiment of a decoding device according to the present invention, wherein an area-position-and-shape decoding portion 9 and a lower-layer decoding portion 8 are similar in function to those shown in Fig. 2.

[0081] In Fig. 12, a coded data separating portion 10 separately extracts, from coded data, area-position-and-shape data, lower-layer coded data, first upper-layer coded data and second upper-layer coded data.
[0082] A first upper-layer decoding portion 11 decodes the first upper-layer coded data, while a second upper-layer decoding portion 13 decodes the second upper-layer coded data.

[0083] An upper-layer synthesizing portion 14 combines a second upper-layer decoded image with a first upper-layer decoded image to produce a synthesized image by using information on the area position and shape. The synthesis of a selected area is conducted by using the second upper-layer decoded image, while the synthesis of other areas is conducted by using the first upper-layer decoded image. Therefore, an image outputted from the decoding device relates to a whole image wherein a selected area is particularly decoded to be of a higher quality as to parameters such as SNR (Signal-to-Noise Ratio), time resolution and spatial resolution. An area selected by the coding device is thus decoded to have a higher quality than that of other areas.
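
The selection rule of the upper-layer synthesizing portion 14 can be sketched as follows. This is only an illustrative sketch: the function name `synthesize_upper_layers`, the list-of-lists image layout and the assumption that both decoded images already share one resolution are not taken from the description.

```python
def synthesize_upper_layers(first_upper, second_upper, area_mask):
    """Use second upper-layer pixels inside the selected area (mask 1)
    and first upper-layer pixels everywhere else (mask 0)."""
    return [
        [s if m else f for f, s, m in zip(f_row, s_row, m_row)]
        for f_row, s_row, m_row in zip(first_upper, second_upper, area_mask)
    ]


# Hypothetical 2x3 images: 10 = lower-quality pixel, 90 = higher-quality pixel.
first_upper = [[10, 10, 10], [10, 10, 10]]
second_upper = [[90, 90, 90], [90, 90, 90]]
area_mask = [[0, 1, 1], [0, 0, 1]]
combined = synthesize_upper_layers(first_upper, second_upper, area_mask)
```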

[0084] Fig. 13 is a block diagram showing a conventional device as a reference to the present invention. A first video-sequence is assumed to be a background video and a second video-sequence is assumed to be a part video. An alpha-plane is weight data used when synthesizing a part image with a background image in a moving picture (video) sequence. Fig. 14 shows an exemplified image made of pixels weighted with values of 1 to 0. The alpha-plane data is assumed to be 1 within a part and 0 outside the part. The alpha data may have a value of 0 to 1 in a boundary portion between a part and the outside thereof in order to indicate a mixed state of pixel values in the boundary portion and the transparency of a transparent substance such as glass.

[0085] Referring to Fig. 13 showing the conventional method, a first video-coding portion 101 encodes the first video-sequence and a second video-coding portion 102 encodes the second video-sequence according to an international standard video-coding system, e.g., MPEG or H.261. An alpha-plane coding portion 112 encodes an alpha-plane. In the above-mentioned paper, this portion uses the techniques of vector quantization and Haar transformation. A coded-data integrating portion (not shown) integrates coded data received from the coding portions and accumulates or transmits the integrated coded data.
[0086] In the decoding device of the conventional method, a coded-data disassembling portion (not shown) disassembles coded data into the coded data of the first video-sequence, the coded data of the second video-sequence and the coded data of the alpha-plane, which are then decoded respectively by a first video-decoding portion 105, a second video-decoding portion 106 and an alpha-plane decoding portion 113. Two decoded sequences are synthesized according to weighted mean values by a first weighting portion 108, a second weighting portion 109 and an adder 111. The first video-sequence and the second video-sequence are combined according to the following equation:

$$f(x,y,t) = (1 - \alpha(x,y,t))\, f_1(x,y,t) + \alpha(x,y,t)\, f_2(x,y,t)$$

In the equation, (x,y) represents coordinate data of an intraframe pixel position, t denotes a frame time, $f_1(x,y,t)$ represents a pixel value of the first video sequence, $f_2(x,y,t)$ represents a pixel value of the second video sequence, $f(x,y,t)$ represents a pixel value of the synthesized video sequence and $\alpha(x,y,t)$ represents alpha-plane data. Namely, the first weighting portion 108 uses $1 - \alpha(x,y,t)$ as a weight while the second weighting portion 109 uses $\alpha(x,y,t)$ as a weight.
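
As a minimal sketch of this weighted-mean synthesis for a single frame, the two weighting portions and the adder can be written as below. The function names and the list-of-lists frame layout are assumptions made for illustration only.

```python
def blend_pixel(f1, f2, alpha):
    """Weighted mean of a background pixel f1 and a part pixel f2."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # (1 - alpha) is the first weight, alpha is the second weight.
    return (1.0 - alpha) * f1 + alpha * f2


def blend_frame(background, part, alpha_plane):
    """Apply blend_pixel to every pixel of equally sized 2-D frames."""
    return [
        [blend_pixel(b, p, a) for b, p, a in zip(b_row, p_row, a_row)]
        for b_row, p_row, a_row in zip(background, part, alpha_plane)
    ]


# Hypothetical one-row frame: outside the part, on its boundary, inside it.
background = [[100, 100, 100]]
part = [[200, 200, 200]]
alpha_plane = [[0.0, 0.5, 1.0]]
frame = blend_frame(background, part, alpha_plane)
```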
[0087] As mentioned above, the conventional method produces a large amount of coded data because it must encode alpha-plane data.

[0088] To avoid this problem, reducing the amount of information by binarizing alpha-plane data may be considered, but this is accompanied by a visual defect: a tooth-like line appears at the boundary between a part image and a background as the result of a discontinuous change of pixel values thereabout.

[0089] Fig. 15 is a block diagram showing a coding device and a decoding device embodying the present invention. In Fig. 15, a first video-coding portion 101, a second video-coding portion 102, a first video-decoding portion 105, a second video-decoding portion 106, a first weighting portion 108, a second weighting portion 109 and an adder 111 are similar in function to those of the conventional device and, therefore, will not be further explained. In Fig. 15, an area-information coding portion 103 encodes area information representing the shape of a part image of the second video-sequence, an area-information decoding portion 107 decodes the coded area information and an alpha-plane generating portion 110 prepares an alpha-plane by using the decoded area information.
[0090] The operations of the coding device and the decoding device are as follows:

[0091] The coding device encodes the first video-sequence and the second video-sequence by the first video-coding portion 101 and the second video-coding portion 102 respectively and encodes area information by the area-information coding portion 103 according to a method to be described later. These coded data are integrated for further transmission or accumulation by a coded-data integrating portion (not shown). On the other hand, the decoding device separates the transmitted or accumulated coded data by the coded-data separating portion (not shown) and decodes the separated coded data by the first video-decoding portion 105, the second video-decoding portion 106 and the area-information decoding portion 107 respectively. The alpha-plane generating portion 110 prepares an alpha-plane from the decoded area information by a method to be described later. The first weighting portion 108, the second weighting portion 109 and the adder 111 may synthesize two decoded sequences by using weighted mean values according to the prepared alpha-plane.

[0092] Fig. 16 shows an example of area information that corresponds to the area information of a part video-image of [...]. The area information is binarized [...]. Area information may thus be obtained by binarizing the [...] area by a method described in a reference material ([...] Institute of Image Electronics [...]). The area information to be used may be a rectangle. In this instance, the area information is binarized as 1 within a body and as 0 outside the body.

[0093] Applicable methods for encoding the area information, which will not be explained in detail, may be run-length coding and chain coding since the area information is binarized data. If area data represents a rectangle, it requires encoding [...]
[0094] viar^^ 

[0095] In case of using the [...] following linear weight function [...]:

$$W_{N,a}(x) = \begin{cases} \sin^2\left[(x + 1/2)\pi/L\right] & (0 \le x < L/2) \\ 1 & (L/2 \le x < N - L/2) \\ \sin^2\left[((x - M) + 1/2)\pi/L\right] & (N - L/2 \le x < N) \end{cases} \quad (1)$$

[0096] In the equation (1), M is equal to "aN" and L is equal to "N-M" ("a" is a real value of 0 to 1). "N" represents a size of a rectangle and "a" represents the flatness of the weight. Fig. 17 shows an example of a linear weight function. A weight for a rectangle is expressed as follows:

$$W(x,y) = W_{N_x,a_x}(x)\, W_{N_y,a_y}(y) \quad (2)$$

In the equation (2), a size of the rectangle is expressed by the number of pixels "Nx" in horizontal direction and pixels "Ny" in vertical direction and the flatness of weight is expressed by "ax" in horizontal direction and by "ay" in vertical direction.
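
A weight function of this kind for a rectangular area can be sketched as follows. The source equations are badly damaged, so the sketch rests on assumptions that should be checked against the original drawings: a squared-sine ramp of total length L = N - M with M = aN, half-open index ranges for the three pieces, and a two-dimensional weight formed as the product of a horizontal and a vertical one-dimensional weight. The names `weight_1d` and `weight_2d` are hypothetical.

```python
import math


def weight_1d(x, n, a):
    """Weight at pixel index x (0 <= x < n) along a side of n pixels.
    a in [0, 1] is the flatness: the fraction of the side with weight 1."""
    m = a * n          # M = aN
    l = n - m          # L = N - M, total length of the two ramps
    if l <= 0:         # a == 1: no ramp at all (assumed behaviour)
        return 1.0
    if x < l / 2:                      # rising ramp at the start
        return math.sin((x + 0.5) * math.pi / l) ** 2
    if x < n - l / 2:                  # flat part
        return 1.0
    # falling ramp at the end
    return math.sin(((x - m) + 0.5) * math.pi / l) ** 2


def weight_2d(x, y, nx, ny, ax, ay):
    """Assumed separable weight for an nx-by-ny rectangle."""
    return weight_1d(x, nx, ax) * weight_1d(y, ny, ay)


# One row of weights for an 8-pixel side with half of it flat.
row = [round(weight_1d(x, 8, 0.5), 4) for x in range(8)]
```

With these assumptions the ramp is symmetric about the centre of the side, which gives a quick sanity check on the reconstruction.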

[0097] Various combinations of linear weight functions other than [...]

[0098] Three different methods for preparing an alpha-plane for an area of any desired shape, by way of example, will be described as follows:

[0099] The first method is to determine a circumscribed rectangle of the area and then to apply the above-mentioned linear weight functions to the circumscribed rectangle in horizontal direction and vertical direction respectively.

[0100] The second method is to sequentially determine weight values to be applied to an area from its circumference as shown in Fig. 18. For example, pixels at the circumference of the area are determined and are given a weight of 0.2 respectively. Next, pixels at the circumference of a still-not-weighted part within the area are determined and are given a weight of 0.5 respectively. These operations are repeated until [...]. Preparation of an alpha-plane is finished by applying a weight of 1.0 to a last not-weighted area. The alpha-plane thus obtained has a value of 1.0 at its center portion and a value of 0.2 at its circumferential portion. In case of determining weight values from the circumference of an area, it is possible to use a linear weight function of the equation (1) for determining the weight values. In sequentially changing a weight-value, a circumferential pixel thickness may be a single pixel or more.
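
The ring-by-ring assignment of the second method can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure itself: a pixel is treated as circumferential when one of its four neighbours lies outside the still-unweighted part, each ring is one pixel thick, and the ring weights 0.2 and 0.5 are simply the example values from the text. The name `peel_alpha` is hypothetical.

```python
def peel_alpha(mask, ring_weights=(0.2, 0.5)):
    """Build an alpha-plane from a binary mask by weighting successive
    one-pixel rings from the circumference inward."""
    height, width = len(mask), len(mask[0])
    alpha = [[0.0] * width for _ in range(height)]
    remaining = {(y, x) for y in range(height) for x in range(width) if mask[y][x]}
    for weight in ring_weights:
        # Circumferential pixels: at least one 4-neighbour is not in the
        # still-unweighted part (image borders count as outside).
        ring = {
            (y, x) for (y, x) in remaining
            if any((y + dy, x + dx) not in remaining
                   for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)))
        }
        for y, x in ring:
            alpha[y][x] = weight
        remaining -= ring
    for y, x in remaining:       # last not-weighted part gets weight 1.0
        alpha[y][x] = 1.0
    return alpha


# Hypothetical 5x5 area filling the whole image.
square = [[1] * 5 for _ in range(5)]
plane = peel_alpha(square)
```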
[0101] The third method is to apply a weight of 0 to the outside of an area and a weight of 1 to the inside of the area and then to process the thus binarized image through a low-pass filter to gradate the area boundary portion. Various kinds of alpha-planes can be prepared by changing a size and coefficient of a filter and the number of filtering operations.
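
The third method can be sketched with a simple averaging filter. The choice of a box (mean) filter, its border handling and the names `box_filter` and `smooth_alpha` are assumptions for illustration; the text only requires some low-pass filter whose size, coefficients and number of passes can be varied.

```python
def box_filter(plane, radius=1):
    """One pass of a (2*radius+1)-square mean filter; windows are clipped
    at the image border (an assumed border rule)."""
    height, width = len(plane), len(plane[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += plane[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def smooth_alpha(mask, radius=1, passes=1):
    """Binary mask (0 outside, 1 inside) -> graded alpha-plane."""
    plane = [[float(v) for v in row] for row in mask]
    for _ in range(passes):
        plane = box_filter(plane, radius)
    return plane


# Hypothetical mask whose area boundary runs vertically between columns 1 and 2.
mask = [[0, 0, 1, 1, 1] for _ in range(5)]
graded = smooth_alpha(mask, radius=1, passes=1)
```

Increasing `radius` or `passes` widens the graded boundary, which corresponds to the different kinds of alpha-planes mentioned above.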
[0102] As is apparent from the foregoing, the first embodiment can attain an increased efficiency of data coding in comparison with the conventional device because the alpha-plane is prepared by the decoding side, thereby eliminating the need of encoding weight-information. In addition, the decoding device prepares an alpha-plane from the decoded area information and synthesizes video-sequences by using the prepared alpha-plane, thereby preventing the occurrence of such a visual defect that a toothed line appears at the boundary of a part image in the background.
[0103] Another embodiment of the present invention will be described as follows:

[0104] Fig. 19 is a block diagram showing a coding device and a decoding device of the embodiment. In Fig. 19, a first weighting portion 108, a second weighting portion 109 and an adder 111 are similar to those of the conventional device and are omitted from the further explanation. An area-information coding portion 103, an area-information decoding portion 107 and alpha-plane generating portions 121 and 122 are similar in function to those of the first embodiment and, therefore, will not be further explained.

[0105] This embodiment is featured in that the coding side is also provided with an alpha-plane generating portion 120 for encoding an image with weight values for synthesizing a plurality of video sequences. Coded data becomes smaller than the original data because weight data is not more than 1, and, therefore, the amount of coded data can be reduced.

[0106] In Fig. 19, a first video-coding portion 122 and a second video-coding portion 123 encode images of video-sequences by weighting on the basis of respective alpha-planes prepared by the coding side. A first video-decoding portion 124 and a second video-decoding portion 125 decode the coded images of the video-sequences by inversely weighting on the basis of respective alpha-planes prepared by the decoding side.

[0107] The first video-coding portion 122 or the second video-coding portion 123 may be constructed for transform-coding as shown, for example, in Fig. 20. A video-sequence to be processed is the first video-sequence or the second video-sequence. A transforming portion 131 transforms an input image block by block by using a transforming method such as DCT (Discrete Cosine Transform), discrete Fourier transform or wavelet transform.

[0108] In Fig. 20, a first weighting portion 132 weights a transform-coefficient with an alpha-plane value. The value used for weighting may be a representative of an alpha-plane within an image block to be processed. For example, a mean value of the alpha-plane within the block is used. Transform-coefficients of the first video sequence and the second video sequence are expressed by $g_1(u,v)$ and $g_2(u,v)$ respectively and they are weighted according to the following equations:

$$g_{w1}(u,v) = (1 - \bar{\alpha})\, g_1(u,v), \qquad g_{w2}(u,v) = \bar{\alpha}\, g_2(u,v) \quad (3)$$

[0109] In the equation (3), $g_{w1}(u,v)$ and $g_{w2}(u,v)$ denote weighted transform coefficients, u and v denote horizontal and vertical frequencies, and $\bar{\alpha}$ is a representative of an alpha-plane in a block.

[0110] In Fig. 20, a quantizing portion 133 quantizes transform coefficients and a variable-length coding portion 134 encodes the quantized transform coefficients with variable-length codes to generate coded data.

[0111] A first video-decoding portion 124 or a second video-decoding portion 125, which corresponds to the video-coding portion of Fig. 19, may be constructed as shown in Fig. 21. A variable-length decoding portion 141 decodes coded data, an inversely quantizing portion 142 inversely quantizes decoded data and an inversely weighting portion 143 performs a reverse operation on transform coefficients to reverse the equation (3). Namely, the transform coefficients are weighted with weight values that reverse those applied at the coding side according to the following equation:

$$\hat{g}_1(u,v) = \hat{g}_{w1}(u,v)/(1 - \bar{\alpha}), \qquad \hat{g}_2(u,v) = \hat{g}_{w2}(u,v)/\bar{\alpha} \quad (4)$$

[0112] In the equation (4), the hat indicates decoded data, e.g., $\hat{g}_{w1}$ is a weighted decoded transform coefficient of the first video-sequence.
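
The weighting of equation (3) and its reversal by equation (4) can be sketched for one block as below. The function names are hypothetical, quantization is left out so that the round trip is exact, and the error raised for a zero weight is an assumption, since the description does not say how a block with a representative alpha of exactly 0 or 1 is handled.

```python
def block_weight(alpha_bar, is_first_sequence):
    """(1 - alpha_bar) for the first sequence, alpha_bar for the second."""
    return (1.0 - alpha_bar) if is_first_sequence else alpha_bar


def weight_coefficients(block, alpha_bar, is_first_sequence):
    """Coding side: scale every transform coefficient of a block."""
    w = block_weight(alpha_bar, is_first_sequence)
    return [[w * c for c in row] for row in block]


def unweight_coefficients(block, alpha_bar, is_first_sequence):
    """Decoding side: divide by the same weight to reverse the scaling."""
    w = block_weight(alpha_bar, is_first_sequence)
    if w == 0.0:
        raise ValueError("weight is zero; the coefficients cannot be restored")
    return [[c / w for c in row] for row in block]


# Hypothetical 2x2 block of transform coefficients, alpha_bar = 0.25.
coefficients = [[8.0, -4.0], [2.0, 0.0]]
weighted = weight_coefficients(coefficients, 0.25, True)
restored = unweight_coefficients(weighted, 0.25, True)
```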

[0113] Besides the above-mentioned weighting method, there is also an applicable method that does not weight a direct current component of a transform coefficient and weights other transform coefficients according to the equation (3). In this case, weighting is substantially effected by correcting a quantizing-step width adopted by the international standard MPEG or H.261 by using a representative value of the alpha-plane within the block.

[0114] Namely, a quantizing-step width changing portion 138 is provided as shown in Fig. 22, whereby a quantizing-step width determined by a quantizing-step width determining portion (not shown) is changed by using alpha-plane data. In practice, a representative $\bar{\alpha}$ (e.g., a mean value) of the alpha-plane within a block is first determined, then the quantizing-step width is divided by a value $(1 - \bar{\alpha})$ for the first video sequence or by a value $\bar{\alpha}$ for the second video-sequence to obtain a new quantizing-step width.
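
The step-width change can be sketched as below; it also shows why dividing the step width by the weight acts like weighting the coefficient itself. The names `changed_step_width` and `quantize`, the plain rounding quantizer and the error raised for a non-positive weight are illustrative assumptions, not part of MPEG or H.261.

```python
def changed_step_width(step, alpha_bar, is_first_sequence):
    """Divide the step width by (1 - alpha_bar) or by alpha_bar."""
    weight = (1.0 - alpha_bar) if is_first_sequence else alpha_bar
    if weight <= 0.0:
        raise ValueError("weight must be positive to change the step width")
    return step / weight


def quantize(coefficient, step):
    """Simplified uniform quantizer used only for this illustration."""
    return round(coefficient / step)


# Hypothetical block with alpha_bar = 0.25 and an original step width of 8.
step_first = changed_step_width(8.0, 0.25, True)
step_second = changed_step_width(8.0, 0.25, False)
# Quantizing with the enlarged step matches quantizing a weighted coefficient.
level_changed_step = quantize(100.0, step_second)
level_weighted = quantize(0.25 * 100.0, 8.0)
```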

[0115] There are two inversely weighting methods which correspond to the above-mentioned weighting method. The first method relates to a case when a quantizing-step width (without being changed by the quantizing-step width changing portion 138) is encoded by the coding device shown in Fig. 22. In this case, the decoding device of Fig. 23, which is provided with a quantizing-step width changing portion 148 corresponding to that of the coding side of Fig. 22, decodes the quantizing-step width by a quantizing-step width decoding portion (not shown) and then changes the decoded quantizing-step width by the quantizing-step width changing portion 148 according to the alpha-plane data. The second method relates to a case when a quantizing-step width after being changed by the quantizing-step width changing portion 138 is encoded by the coding device shown in Fig. 22. In this case, the decoding device directly uses the decoded quantizing-step width and inversely quantizes it. This eliminates the use of a special inversely weighting device (i.e., the quantizing-step width changing portion 148 of Fig. 23). The second method, however, is considered to have a decreased flexibility of weighting as compared with the first method.

[0116] The above-described second embodiment uses the transform coding. Therefore, a motion compensative coding [...] H.261 system was omitted from Figs. 20 to 23. This method, however, can be applied to a coding system using the motion compensative prediction. In this instance, a prediction error for motion compensative prediction is inputted into a transforming portion 131 of Fig. 20.
[0117] Other weighting methods in the second embodiment are as follows:

[0118] Fig. 24 shows an example of the first video-coding portion 122 or the second video-coding portion 123 of the coding device shown in Fig. 19. Namely, the coding portion is provided with a weighting portion 150 which performs a weighting operation before video coding by the standard method MPEG or H.261 according to the following equation:

$$f_{w1}(x,y) = (1 - \bar{\alpha})\, f_1(x,y), \qquad f_{w2}(x,y) = \bar{\alpha}\, f_2(x,y) \quad (5)$$

[0119] In the equation (5), $f_{w1}(x,y)$ is the first weighted video-sequence, $f_{w2}(x,y)$ is the second weighted video-sequence and $\bar{\alpha}$ is a representative of an alpha-plane within a block.
[0120] Weighting may be effected according to the following equation:

$$f_{w1}(x,y) = (1 - \alpha(x,y))\, f_1(x,y), \qquad f_{w2}(x,y) = \alpha(x,y)\, f_2(x,y) \quad (6)$$

[0121] Fig. 25 shows an inversely weighting method of the decoding device, which corresponds to the above-mentioned weighting method. The inversely weighting portion 161 weights the video-sequence with a weight reversing that applied by the coding device.
[0122] When the [...] omit the inversely weighting portion 161, the first weighting portion 108 and the second weighting portion 109 for synthesizing sequences, which are shown in Fig. 19. Namely, it is possible to use a coding device and a decoding device which are shown in Fig. 26. A first video-coding portion 122 and a second video-coding portion 123, which are shown in Fig. 26, are constructed as shown in Fig. 24 and use the weighting method of equation (5). In this instance, weight information such as area information [...] is included in the video coded data itself, so the weighting information does not require encoding. Accordingly, sequences decoded by the decoding device can be directly added to each other to generate a synthesized sequence. Encoding only data within an area is more effective than encoding a whole image if a video-sequence 102 relates to a part image. In this case, it becomes necessary to encode the area information by the coding device and to decode the coded area information by the decoding device.
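
The idea that pre-weighted sequences can simply be added at the decoding side can be sketched as below. The sketch uses the per-pixel weights of equation (6) and ignores any coding loss, and the names `preweight` and `add_frames` are hypothetical.

```python
def preweight(frame, alpha_plane, is_first_sequence):
    """Coding side: weight each pixel before video coding."""
    return [
        [((1.0 - a) if is_first_sequence else a) * p for p, a in zip(p_row, a_row)]
        for p_row, a_row in zip(frame, alpha_plane)
    ]


def add_frames(frame_a, frame_b):
    """Decoding side: plain pixel-wise addition, no weighting portions needed."""
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(frame_a, frame_b)]


# Hypothetical one-row frames; coding and decoding are assumed lossless here.
alpha_plane = [[0.0, 0.5, 1.0]]
weighted_background = preweight([[100, 100, 100]], alpha_plane, True)
weighted_part = preweight([[200, 200, 200]], alpha_plane, False)
synthesized = add_frames(weighted_background, weighted_part)
```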

[0123] The foregoing description relates to an example of weighting each of plural video-sequences in the second embodiment of the present invention. For example, the first video-sequence is weighted with a value of $(1 - \alpha)$ while the second video-sequence is weighted with a value of $\alpha$.

[0124] Although the embodiments have been explained in case of synthesizing one background video-sequence and one part video-sequence, the present invention is not limited thereto but can be adapted to synthesize a plurality of part video-sequences with a background. In this instance, each area information corresponding to each part image is

[0125] The background image and part images may be independently encoded or may be hierarchically encoded, considering the background image as a lower-layer and the part images as upper-layers. In the latter case, each upper-layer image can be effectively encoded by predicting its pixel value from that of the lower-layer image.
[0126] There has been studied a video coding method that is adapted to synthesize different kinds of video sequences.

[0127] The following description shows conventional devices as reference to the present invention.

[0128] A paper "Image coding using hierarchical representation and multiple templates", which appeared in Technical Report of IEICE IE94-159, pp. 99-106, 1995, describes an image synthesizing method that combines a video-sequence being a background video and a part-video-sequence being a foreground video (e.g., a figure image or a fish image cut out by using the chromakey technique) to produce a new sequence.

[0129] A paper "Temporal Scalability based on image content" (ISO/IEC JTC1/SC29/WG11 MPEG95/211, (1995)) describes a technique for preparing a new video-sequence by synthesizing a part-video sequence of a high frame rate with a video-sequence of a low frame rate. As shown in Fig. 27, this system is to encode a lower-layer frame at a low frame-rate by a prediction coding method and to encode only a selected area (hatched part) of an upper-layer frame at a high frame rate by prediction coding. The upper layer does not encode a frame coded at the lower-layer and uses a copy of the decoded image of the lower-layer. The selected area may be considered to be a remarkable part of the image, e.g., a human figure.

[0130] Fig. 28 is a block diagram showing a conventional method. At the coding side, an input video-sequence is thinned by a first thinning portion 201 and a second thinning portion 202 and the thinned video-sequence with a reduced frame rate is then transferred to an upper-layer coding portion and a lower-layer coding portion respectively. The upper-layer coding portion has a frame rate higher than that of the lower-layer coding portion.
[0131] The lower-layer coding portion 204 encodes a whole image of each frame in the received video-sequence by using an international standard video-coding method such as MPEG, H.261 and so on. The lower-layer coding portion 204 also prepares decoded frames which are used for prediction coding and, at the same time, are inputted into a synthesizing portion 205.

[0132] Fig. 29 is a block diagram of a code-amount control portion of a conventional coding portion. In Fig. 29, a coding portion 212 encodes video frames by using a method or a combination of methods such as motion compensative prediction, orthogonal transformation, quantization, variable length coding and so on. A quantization-width (step-size) determining portion 211 determines a quantization-width (step size) to be used in a coding portion 212. A coded-data amount determining portion 213 calculates an accumulated amount of generated coded data. Generally, the quantization width is increased or decreased to prevent increase or decrease of the coded data amount.
[0133] In Fig. 28, the upper-layer coding portion 203 encodes only a selected part of each frame in a received video-sequence on the basis of area information by using an international standard video-coding method such as MPEG, H.261 and so on. However, frames encoded at the lower-layer coding portion 204 are not encoded by the upper-layer coding portion 203. The area information is information indicating a selected area of, e.g., an image of a human figure in each video frame, which is a binarized image taking 1 in the selected area and 0 outside the selected area. The upper-layer coding portion 203 also prepares decoded selected areas of each frame, which are transferred to the synthesizing portion 205.
[0134] An area-information coding portion 206 encodes area information by using 8-directional quantizing codes. The 8-directional quantizing code is a numeric code indicating a direction to a proceeding point as shown in Fig. 30 and it is usually used for representing digital graphics.
[0135] A synthesizing portion 205 outputs a decoded lower-layer video-frame when the frame to be synthesized has been encoded by the lower-layer coding portion. When a frame to be synthesized has not been encoded at the lower-layer coding portion, the synthesizing portion 205 outputs a decoded video-frame that is generated by using two decoded frames, which have been encoded at the lower-layer and stand before and after the lacking lower-layer frame, and one decoded upper-layer frame to be synthesized. The two lower-layer frames stand before and after the upper-layer frame. The synthesized video-frame is inputted into the upper-layer coding portion 203 to be used therein for predictive coding. The image processing in the synthesizing portion 205 is as follows:

[0136] An interpolating image is first prepared from two lower-layer frames. A decoded image of the lower-layer at time t is expressed as B(x,y,t), where x and y are co-ordinates defining the position of a pixel in a space. When the two decoded images of the lower-layer are located at time t1 and t2 and the decoded image of the upper-layer is located at t3 (t1 < t3 < t2), the interpolating image I(x,y,t3) of time t3 is calculated according to the following equation (1):

$$I(x,y,t_3) = \frac{(t_2 - t_3)\, B(x,y,t_1) + (t_3 - t_1)\, B(x,y,t_2)}{t_2 - t_1} \quad (1)$$

The decoded image E of the upper layer is then synthesized with the obtained interpolating image I by using synthesizing weight information W(x,y,t) prepared from area information. A synthesized image S is defined according to the following equation:

$$S(x,y,t) = [1 - W(x,y,t)]\, I(x,y,t) + E(x,y,t)\, W(x,y,t) \quad (2)$$

The area information M(x,y,t) is a binarized image taking 1 in a selected area and 0 outside the selected area. The weight information W(x,y,t) can be obtained by processing the above-mentioned binarized image several times with a low-pass filter. Namely, the weight information W(x,y,t) takes 1 within a selected area, 0 outside the selected area and a value of 0 to 1 at the boundary of the selected area.
[0137] The coded data prepared by the lower-layer coding portion, the upper-layer coding portion and the area information coding portion is integrated by an integrating portion (not shown) and then is transmitted or accumulated.
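
The conventional synthesis described in paragraph [0136], i.e. temporal interpolation of two lower-layer frames followed by weighted combination with the upper-layer frame, can be sketched as below. The function names and the list-of-lists frame layout are assumptions for illustration.

```python
def interpolate(b1, b2, t1, t2, t3):
    """Linear temporal interpolation of two lower-layer frames at time t3,
    with t1 < t3 < t2."""
    return [
        [((t2 - t3) * p1 + (t3 - t1) * p2) / (t2 - t1) for p1, p2 in zip(r1, r2)]
        for r1, r2 in zip(b1, b2)
    ]


def synthesize(interpolated, upper, weight):
    """S = (1 - W) * I + E * W, applied pixel by pixel."""
    return [
        [(1.0 - w) * i + e * w for i, e, w in zip(i_row, e_row, w_row)]
        for i_row, e_row, w_row in zip(interpolated, upper, weight)
    ]


# Hypothetical one-row frames at t1 = 0 and t2 = 3; the upper-layer frame is at t3 = 1.
lower_t1 = [[0, 90]]
lower_t2 = [[30, 90]]
interp = interpolate(lower_t1, lower_t2, 0, 3, 1)
upper = [[50, 200]]
weight = [[0.0, 1.0]]      # second pixel lies inside the selected area
synth = synthesize(interp, upper, weight)
```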

[0138] In the decoding side of the conventional system, a coded data disassembling portion (not shown) separates coded data into lower-layer coded data, upper-layer coded data and area-information coded data. These coded data are decoded respectively by a lower-layer decoding portion 208, an upper-layer decoding portion 207 and an area-information decoding portion [...].
[0139] [...] synthesizes an image by using a decoded lower-layer image and a decoded upper-layer image according to the same method as described at the coding side. The synthesized video frame is displayed on a display screen and, at the same time, [...]

[0140] The above-described decoding device decodes both the lower-layer and the upper-layer frames, but a decoding device consisting of a lower-layer decoding portion is also applicable, omitting the upper-layer decoding portion 207 and the synthesizing portion [...].

VftW present iiwentk^ls lN 

is tion i^ shown in FigJ 2a^ 
an image fronitwj^^ 

rence of aftenmageHike distortion around the selected area or areas. Fi& ^ is an image syn- 

thesizir^^ T" ^ ., : . 

[0142] In Fig. 32, a first area-extracting portion 221 is to extract an area, which relates to a first area and does not relate to a second area, from a first area information of a lower-layer frame and a second area information of a lower-layer frame. In Fig. 33(a), the first area information [...] dotted area) and the second area information [...]
[0143] A second area-extracting portion 222 of Fig. 32 is intended to extract an area which relates to the second area and does not relate to the first area [...] of a lower-layer [...]

|01^ < ln:^^ ax^ a siiM output ^ 

theiscxftrt^ the switch ?21: is ^Gbmwct^ 

posito;^ 

30 ren^^^ is conr>ect«T to an outp^ 

ihgpbr^^ -^V- : r 

[014S?^ calculates arvinteri^ 

low^l^^ imaged 

EquaiOT (1)^ image is expressed asB(x.y.t1). the second decoded image te^^^ 

35 antf^jnt*^^ 

dnd dhkxx^ respectively. - v ;V'V : v ; ^v 

[0146] J*e|^r^ ima^ Jhus generated is featured by *iat the hatd^ 

a background image, outside the selected. area, of the second decoded frame, a dotted 
iniage/outeicte the selected area, of the first de^ arid other portions «a «k>d w^^ 

40 between tte ifrst and second decoded frames. The upper-layer decoded image is tfie^<^^ 
tiohedipn2^^ 

the selected (hatched) area and is.free from thedistorlion CKXun^inthepnbrartimaga TO 

ten 226 combines the interpolating image with the t^per-teyer decoded mage by using wi^ dod ineans. 1>» weigtrt^ 
averaging method was descnt^ b^ : 
[0147] In the above-mentioned embodiment, it is also possible to use, instead of the mean-weighting portion 225, pixel values of either the first decoded image B(x,y,t1) or the second decoded image B(x,y,t2), whichever is temporally nearer to the time mark t3 of the upper-layer image. In this instance, the interpolating image I may be expressed by using a frame number as follows:

$$I(x,y,t_3) = B(x,y,t_1) \quad \text{in case of } t_3 - t_1 < t_2 - t_3, \text{ or}$$

$$I(x,y,t_3) = B(x,y,t_2) \quad \text{in all other cases.}$$

In the expressions, t1, t2 and t3 denote time marks of the first decoded image, the second decoded image and the upper-layer decoded image.
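
The nearest-frame alternative can be sketched in a few lines. The function name is hypothetical, and the comparison t3 - t1 < t2 - t3 is a reading of a damaged expression, chosen so that it selects the temporally nearer frame as the text requires; a tie falls into "all other cases" and selects the second frame.

```python
def nearest_lower_layer_frame(b1, b2, t1, t2, t3):
    """Return the lower-layer decoded frame that is temporally nearer to t3."""
    if (t3 - t1) < (t2 - t3):
        return b1
    return b2


# Hypothetical frames identified by labels, located at t1 = 0 and t2 = 4.
early = nearest_lower_layer_frame("B1", "B2", 0, 4, 1)
late = nearest_lower_layer_frame("B1", "B2", 0, 4, 3)
tie = nearest_lower_layer_frame("B1", "B2", 0, 4, 2)
```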

[0148] Arwther embodinTem of the ^ 

[0149] This embodiment relates to an image synthesizing device which is based on the first embodiment and is capable of generating a more accurate synthesized image with consideration of motion alternation of lower-layer decoded images. Fig. 34 is a block diagram showing a device for predicting a motion parameter and modifying area information of two corresponding frames.
[0150] In Fig. 34, a motion-parameter estimating portion 231 estimates information about the motion from a first
lower-layer decoded image to a second lower-layer decoded image by determining motion parameters, e.g., a motion
vector per block and a whole-image movement (parallel displacement, rotation, enlargement and contraction).

[0151] An area-form modifying portion 232 modifies the first decoded image, the second decoded image, the first area
information and the second area information according to the respective predicted motion parameters, based on the
temporal position of the synthesizable frame. For example, a motion vector (MVx, MVy) from the first decoded image to
the second decoded image is determined as a motion parameter, where MVx is the horizontal component and MVy is the
vertical component. A motion vector from the first decoded image to the interpolating image is determined according
to the expression: (t3-t1)/(t2-t1) × (MVx, MVy). The first decoded image is then shifted according to the obtained
vector. When other motion parameters such as rotation, enlargement and contraction are used, the image is not only
shifted but also deformed. In Fig. 34, the deformed (modified) data sets "a", "b", "c" and "d" relate respectively
to the first decoded image, the second decoded image, the first area information and the second area information of
Fig. 32. These data sets are inputted into the image synthesizing device shown in Fig. 32, which generates a
synthesized image. Although the above-described embodiment predicts the motion parameters from two decoded images, it
may also use a motion vector of each block of each image, which is usually included in coded data prepared by
predictive coding. For example, a mean value of the decoded motion vectors may be applied as a motion vector of a
whole image from the first decoded frame to the second decoded frame. It is also possible to determine a frequency
distribution of the decoded motion vectors and to use the vector of highest frequency as a motion parameter of a
whole image from the first decoded frame to the second decoded frame. The above-mentioned processing is performed
independently in a horizontal direction and a vertical direction.
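
A minimal sketch of the two operations described above is given below: scaling a motion vector to the temporal position t3 of the interpolating image, and deriving a whole-image vector from decoded block vectors by either the mean or the most frequent value, each component being processed independently. The function names are hypothetical and the block vectors are assumed to be available as (MVx, MVy) pairs.

```python
from collections import Counter


def scale_motion_vector(mv, t1, t2, t3):
    """Scale (MVx, MVy), measured from t1 to t2, to the interval t1..t3."""
    mvx, mvy = mv
    scale = (t3 - t1) / (t2 - t1)
    return (scale * mvx, scale * mvy)


def most_frequent(values):
    """Return the value of highest frequency in a list."""
    return Counter(values).most_common(1)[0][0]


def whole_image_vector(block_vectors, use_mode=False):
    """Derive one vector for the whole image from per-block vectors.

    The horizontal and vertical components are handled independently.
    """
    xs = [v[0] for v in block_vectors]
    ys = [v[1] for v in block_vectors]
    if use_mode:
        return (most_frequent(xs), most_frequent(ys))
    return (sum(xs) / len(xs), sum(ys) / len(ys))


if __name__ == "__main__":
    print(scale_motion_vector((8, -4), t1=0, t2=4, t3=1))
    print(whole_image_vector([(2, 1), (2, 1), (5, 0)], use_mode=True))
```

Reusing decoded block vectors in this way avoids a separate motion search on the decoded images.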

[0152] Another embodiment of the present invention is as follows:

[0153] This embodiment relates to an area-information coding device capable of effectively encoding area
information. Figs. 35 and 36 are block diagrams of this embodiment, whose coding side is shown in Fig. 35 and whose
decoding side is shown in Fig. 36.

[0154] In Fig. 35, an area-information approximating portion 241 approximates area information by using a plurality
of geometrical figures. Fig. 37 shows an example of approximating the area information of a human figure (hatched
portion) with two rectangles. Rectangle 1 represents the head of a person and rectangle 2 represents the breast
portion of the person.
[0155] An approximated-area-information coding portion 242 encodes the approximated area information. An area
approximated by rectangles as shown in Fig. 37 may be encoded with a fixed code length by encoding the coordinates of
the left-top point of each rectangle and the size of each rectangle with a fixed code length. An area approximated by
an ellipse may be encoded with a fixed code length by encoding the coordinates of its center, its long-axis length
and its short-axis length. The approximated area information and the coded data are sent to a selecting portion 244.
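
The fixed-length coding of rectangles may be illustrated as below. The field order (left, top, width, height) and the field width of 10 bits are assumptions made only for this sketch; the text does not specify them.

```python
FIELD_BITS = 10  # assumed fixed code length per field, not taken from the text


def encode_rectangles(rects, bits=FIELD_BITS):
    """Encode (left, top, width, height) tuples as a fixed-length bit string."""
    out = []
    for rect in rects:
        for value in rect:
            if not 0 <= value < (1 << bits):
                raise ValueError("value does not fit the fixed code length")
            out.append(format(value, "0{}b".format(bits)))
    return "".join(out)


def decode_rectangles(bitstring, bits=FIELD_BITS):
    """Inverse of encode_rectangles."""
    fields = [int(bitstring[i:i + bits], 2)
              for i in range(0, len(bitstring), bits)]
    return [tuple(fields[i:i + 4]) for i in range(0, len(fields), 4)]


if __name__ == "__main__":
    head, body = (40, 10, 60, 50), (20, 60, 100, 120)
    code = encode_rectangles([head, body])
    print(len(code), decode_rectangles(code))
```

Each rectangle always costs 4 fields of the same length, so the amount of coded data depends only on the number of figures used for the approximation.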

[0156] Like the area-information coding portion 206 described in Fig. 28, an area-information coding portion 243 of
Fig. 35 encodes the area information by using an 8-directional quantizing code without approximation. The area
information and the coded data are sent to the selecting portion 244.

[0157] The selecting portion 244 selects either one of the two outputs 242 and 243. With the output 242 being
selected, the coded data of the approximated area information with single-bit (e.g., 1) selection information is sent
to a coded-data integrating portion (not shown) and the approximated area information is sent to a synthesizing
portion (not shown). With the output 243 being selected, the coded data of the not-approximated area information with
one bit (e.g., 0) of selection information is sent to the coded-data integrating portion (not shown) and the
not-approximated area information is sent to a synthesizing portion according to the present invention.
[0158] The selecting portion may operate, for example, to select the output which produces the smaller amount of
coded data, or to select the output 243 when the amount of coded data of the not-approximated information does not
exceed a threshold value and the output 242 when said amount exceeds said threshold value. This makes it possible
to reduce the amount of coded data while preventing the area information from being distorted.
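
The two selection rules may be sketched as follows, with coded data represented as bit strings. The flag values (1 for the approximated branch, 0 for the not-approximated branch) and the function names are assumptions for illustration only.

```python
APPROXIMATED = 1      # assumed flag value for the approximated branch
NOT_APPROXIMATED = 0  # assumed flag value for the not-approximated branch


def select_area_code(approx_code, exact_code, threshold=None):
    """Prefix the chosen coded data with single-bit selection information.

    Without a threshold the shorter code wins; with a threshold the
    not-approximated code is kept unless it exceeds the threshold.
    """
    if threshold is None:
        use_exact = len(exact_code) <= len(approx_code)
    else:
        use_exact = len(exact_code) <= threshold
    if use_exact:
        return str(NOT_APPROXIMATED) + exact_code
    return str(APPROXIMATED) + approx_code


def split_area_code(data):
    """Decoder side: read the selection bit and return (kind, payload)."""
    flag, payload = int(data[0]), data[1:]
    kind = "approximated" if flag == APPROXIMATED else "not-approximated"
    return kind, payload


if __name__ == "__main__":
    print(split_area_code(select_area_code("1" * 40, "0" * 300)))
```

The threshold rule keeps the undistorted area information whenever it is affordable and falls back to the approximation only for complicated shapes.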
[0159] The operation of the decoding side of this embodiment is as follows:
[0160] In Fig. 36, a selecting portion 251 selects which kind of area information, approximated or
not-approximated, is to be decoded according to the single-bit selection information contained in the received coded
data.

[0161] In Fig. 36, an approximated-area-information decoding portion 252 decodes the approximated area information,
whereas an area-information decoding portion 253 decodes the not-approximated area information. A switch 254 is
controlled by a signal from the selecting portion 251 to select the approximated area information or the
not-approximated area information as an output to a synthesizing portion.

[0162] Either approximated area information or not-approximated area information is thus adaptively selected,
encoded and decoded. When area information is complicated and may produce a large amount of coded data, the
approximated area information is selected to encode the area information with a small amount of information.
[0163] In the above-mentioned case, the not-approximated area information is encoded by using 8-directional
quantizing codes, but it may be more effectively encoded by using a combination of 8-directional quantization with
predictive coding. An 8-directional quantizing code takes 8 values from 0 to 7 as shown in Fig. 30, which are
differentiated to be from -7 to 7 by predictive coding. A difference, however, may be limited to a range of -3 to 4
by adding 8 if the difference is -4 or less and by subtracting 8 if the difference is more than 4. In decoding, an
original 8-directional quantization value can be obtained by first adding the difference to the precedent value and
then by subtracting or adding 8 when the result is a negative value or exceeds 7. An example is shown below:



8-directional quantization value    1,   6,   2,   1,   3
Difference                               5,  -4,  -1,   2
Converted value                         -3,   4,  -1,   2
Decoded value                       1,   6,   2,   1,   3





For example, the difference between a quantization value 6 and the precedent value 1 is 5, from which 8 is subtracted
to obtain a result of -3. In decoding, -3 is added to the precedent value 1 and a value -2 is obtained, which is
negative and is therefore increased by adding 8 thereto to finally obtain the decoded value 6. Such predictive coding
is effected by utilizing the cyclic nature of the 8-directional quantizing code.
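
The wrap-around rule can be checked with a short sketch. The function names are hypothetical, and sending the first quantization value unchanged is an assumption of this illustration.

```python
def encode_chain(values):
    """Differentially encode 8-directional codes (0..7) into the range -3..4."""
    coded = [values[0]]  # assumption: the first value is sent as it is
    for prev, cur in zip(values, values[1:]):
        diff = cur - prev
        if diff <= -4:
            diff += 8
        elif diff > 4:
            diff -= 8
        coded.append(diff)
    return coded


def decode_chain(coded):
    """Recover the 8-directional codes from the converted differences."""
    values = [coded[0]]
    for diff in coded[1:]:
        value = values[-1] + diff
        if value < 0:
            value += 8
        elif value > 7:
            value -= 8
        values.append(value)
    return values


if __name__ == "__main__":
    codes = [1, 6, 2, 1, 3]
    converted = encode_chain(codes)
    print(converted, decode_chain(converted))
```

Applied to the sequence 1, 6, 2, 1, 3, the differences 5, -4, -1, 2 are converted to -3, 4, -1, 2, and decoding restores the original sequence.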

[0164] Although this embodiment encodes the approximated area information of each image independently, it is
possible to increase the coding efficiency by using the preceding coding result, because video frames usually have a
high interframe correlation. Namely, only the difference between the approximated area information of two successive
frames is encoded if the approximated area information is continuously encoded between the two frames. When, for
example, an area is approximated by a rectangle, a rectangle of a preceding frame is expressed by its left-top
position (19, 20) and size (100, 150) and a rectangle of a current frame is expressed by its left-top position
(13, 18) and size (100, 152), a differential left-top position and a differential size (0, 2) of the current frame
are encoded. When the change of the area is small, the amount of coded data can be reduced, e.g., by Huffman coding
of the differences. When the rectangle does not vary in many frames, it is effective to encode single-bit information
as rectangle change information on a current frame. Namely, single-bit information (e.g., 0) is encoded for a current
frame whose rectangle does not vary, whereas single-bit information (e.g., 1) and the difference information are
encoded for a current frame whose rectangle varies.
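
The interframe coding of a rectangle with a single-bit change flag may be sketched as below. The message layout (flag followed by four differences, taken as current minus preceding) and the numerical values are illustrative assumptions; a variable-length code such as a Huffman code would then be applied to the differences.

```python
def encode_rect_update(prev, cur):
    """Encode a rectangle (left, top, width, height) relative to the preceding frame.

    Returns (0,) when nothing changed, otherwise (1, d_left, d_top, d_width, d_height).
    """
    if cur == prev:
        return (0,)
    return (1,) + tuple(c - p for c, p in zip(cur, prev))


def decode_rect_update(prev, message):
    """Rebuild the current rectangle from the preceding one and the message."""
    if message[0] == 0:
        return prev
    return tuple(p + d for p, d in zip(prev, message[1:]))


if __name__ == "__main__":
    previous = (10, 20, 100, 150)  # illustrative values only
    current = (13, 18, 100, 152)
    message = encode_rect_update(previous, current)
    print(message, decode_rect_update(previous, message))
```

A frame whose rectangle does not vary costs one bit in this scheme.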

[0165] Another embodiment of the present invention is as follows:

[0166] This embodiment relates to a weight-information generating device for preparing weight information from area
information. Fig. 38 is a block diagram of this embodiment.

[0167] In Fig. 38, a horizontal weight generating portion 261 horizontally scans area information and detects 1
therein, then calculates a corresponding weight function. In practice, the abscissa x0 of a left-end point and the
horizontal length N of the area are first determined and then a horizontal weight function is calculated as shown in
Fig. 39(a). The weight function may be prepared by combining straight lines or by combining a line with a
trigonometric function. An example of the latter case is described below. If N > 2W (W is the width of a
trigonometric function), the following weight functions may be applied:

sin[(x+1/2)π/(2W)]         if 0 ≤ x < W;

1                          if W ≤ x < N-W;
sin[(x-N+2W+1/2)π/(2W)]    if N-W ≤ x < N;
sin²[(x+1/2)π/N] × sin[(x+1/2)π/N]    if N ≤ 2W.
In the above-mentioned case, the left-end point x0 of the area is set at 0.
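
A sketch of the horizontal weight function is given below, reading the piecewise definition as two sine ramps of width W joined by a flat part of weight 1, with x0 = 0. The function name is hypothetical, and the narrow-area branch (N not greater than 2W) uses a plain half-sine window as a stand-in, which is an assumption of this sketch rather than the formula of the text.

```python
import math


def horizontal_weight(n, w):
    """Return the weights for x = 0 .. n-1 of an area of horizontal length n."""
    if n > 2 * w:
        weights = []
        for x in range(n):
            if x < w:
                # rising sine ramp at the left end
                value = math.sin((x + 0.5) * math.pi / (2 * w))
            elif x < n - w:
                # flat part in the middle of the area
                value = 1.0
            else:
                # falling sine ramp at the right end
                value = math.sin((x - n + 2 * w + 0.5) * math.pi / (2 * w))
            weights.append(value)
        return weights
    # Assumed stand-in for narrow areas: a single half-sine window.
    return [math.sin((x + 0.5) * math.pi / n) for x in range(n)]


if __name__ == "__main__":
    print([round(v, 3) for v in horizontal_weight(10, 3)])
```

The ramps are mirror images of each other, so the weight is symmetric about the center of the area.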

[0168] In Fig. 38, a vertical weight generating portion 262 vertically scans the area information and detects 1
therein, then calculates a corresponding vertical weight function. In practice, the ordinate y0 of a top-end point
and the vertical length M of the area are determined, then a vertical weight function is calculated as shown in
Fig. 39(b).

[0169] A multiplier 263 multiplies an output 261 by an output 262 at each pixel position to generate weight
information.
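
The multiplication of the horizontal and vertical weight functions may be sketched as follows. For brevity the ramps are built from straight lines, one of the options mentioned in the text, and a single horizontal and vertical extent is taken for the whole area; both choices, as well as the names and the ramp width, are assumptions of this illustration.

```python
def ramp(length, width):
    """Straight-line weight: rises over `width` samples, flat, then falls."""
    return [min(1.0, (k + 1) / (width + 1), (length - k) / (width + 1))
            for k in range(length)]


def weight_map(mask, width=2):
    """Build many-valued weights from a binary mask (list of rows of 0/1)."""
    rows, cols = len(mask), len(mask[0])
    xs = [x for x in range(cols) if any(mask[y][x] for y in range(rows))]
    ys = [y for y in range(rows) if any(mask[y])]
    horiz, vert = [0.0] * cols, [0.0] * rows
    if xs:
        x0, n = xs[0], xs[-1] - xs[0] + 1   # left-end point and horizontal length
        y0, m = ys[0], ys[-1] - ys[0] + 1   # top-end point and vertical length
        horiz[x0:x0 + n] = ramp(n, width)
        vert[y0:y0 + m] = ramp(m, width)
    # Product of the two one-dimensional weights at each pixel position.
    return [[vert[y] * horiz[x] for x in range(cols)] for y in range(rows)]


if __name__ == "__main__":
    mask = [[0] * 6, [0, 1, 1, 1, 1, 0], [0, 1, 1, 1, 1, 0],
            [0, 1, 1, 1, 1, 0], [0] * 6]
    for row in weight_map(mask, width=1):
        print(row)
```

Only two one-dimensional functions are computed, and each pixel then needs a single multiplication.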

[0170] The above-mentioned method can obtain weight information adapted to the form of the area information with
a reduced number of operations.

[0171] Another embodiment of the present invention is as follows:

[0172] This embodiment relates to a method for adaptively switching the coding mode from interframe prediction to
intraframe prediction and vice versa in predictive coding of lower-layer or upper-layer frames. Fig. 40 is a block
diagram of this embodiment.

[0173] In Fig. 40, a mean-value calculating portion 271 determines a mean of the pixel values in an area according
to an input original image and input area information. The mean value is inputted into a differentiator 273 and a
storage 272.

[0174] The differentiator 273 determines a difference between a preceding mean value stored in the storage 272 and
a current mean value outputted from the mean-value calculating portion 271.

[0175] A discriminating portion 274 compares an absolute value of the difference calculated by the differentiator
273 with a predetermined threshold value and outputs mode-selecting information. With the absolute value of the
difference being larger than the threshold, the discriminating portion 274 judges that a scene change occurs in a
selected area and generates a mode-selecting signal to always conduct the intraframe prediction coding.

[0176] Mode selection thus effected by judging a scene change of a selected area is effective to obtain high-quality
coded images even when, for example, a person appears from behind a cover or an object is turned over. The shown
embodiment can be applied to a system for coding a selected area separately from other areas in encoding lower-layer
frames. In this case, area information is inputted into the lower-layer coding portion. This embodiment can also be
applied for coding only a selected area of the upper-layer frame.
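
The mode selection of Fig. 40 may be sketched as follows, with images and area information given as lists of rows. The class and function names are hypothetical, and coding the very first frame in intraframe mode (no preceding mean being stored yet) is an assumption of this sketch.

```python
def area_mean(image, mask):
    """Mean of the pixel values at positions where the area information is 1."""
    total = count = 0
    for image_row, mask_row in zip(image, mask):
        for pixel, inside in zip(image_row, mask_row):
            if inside:
                total += pixel
                count += 1
    return total / count if count else 0.0


class ModeSelector:
    """Keeps the preceding mean value and compares the difference with a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.previous_mean = None

    def select(self, image, mask):
        mean = area_mean(image, mask)
        previous, self.previous_mean = self.previous_mean, mean
        # Assumption: a frame without a stored preceding mean is coded intraframe.
        if previous is None or abs(mean - previous) > self.threshold:
            return "intra"
        return "inter"


if __name__ == "__main__":
    selector = ModeSelector(threshold=20)
    mask = [[1, 1], [1, 0]]
    for frame in ([[10, 10], [10, 200]], [[12, 12], [12, 0]], [[90, 90], [90, 0]]):
        print(selector.select(frame, mask))
```

Pixels outside the area do not influence the decision, so a change of the background alone does not force intraframe coding of the selected area.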

[0177] Another embodiment of the present invention is as follows:

[0178] This embodiment relates to a method for controlling the amount of data when a selected area is encoded
separately from the other areas of each lower-layer frame. Fig. 41 is a block diagram of this embodiment.
[0179] In Fig. 41, a coding portion 283 separates a selected area from other areas and encodes it. An area
discriminating portion 281 receives area information and discriminates whether the encodable area is within or
outside the selected area. A coded-data-amount estimating portion 285 estimates the amount of coded data in each area
on the basis of the above-mentioned discrimination result. A distributing-ratio calculating portion 284 determines
the distributing ratios of a target amount of codes per frame, which will be allocated to the areas. The method for
determining the distributing ratios will be described later. A quantizing width calculating portion determines a
quantizing step-size according to the target amount of coded data. The method for determining the quantizing
step-size is the same as the conventional method.

[0180] The method for determining a code distributing ratio by the target code-allocation calculating portion is as
follows:

[0181] A target code-amount Bi of a frame is calculated according to the following equation:

Bi = (The number of usable bits - The number of bits used for coding preceding frames) / The number of remaining frames



[0182] This target number Bi of bits is distributed at a specified ratio to pixels within a selected area and pixels
outside the selected area. The ratio is determined by using an adequate fixed ratio R0 and a preceding frame
complexity ratio Rp. The complexity ratio Rp of the preceding frame is calculated by the following equation:

Rp = (gen_bitF*avg_qF)/(gen_bitF*avg_qF + gen_bitB*avg_qB)

where gen_bitF = the number of bits for coding pixels in a selected area of a preceding frame, gen_bitB = the number
of bits for coding pixels outside the selected area of a preceding frame, avg_qF = an average quantization step-size
in the selected area of a preceding frame and avg_qB = an average quantization step-size outside the selected area of
a preceding frame. To encode a selected area at a high image quality, it is desirable to adjust the quantizing
step-size so as to keep the average quantizing step-size in the selected area somewhat smaller than that outside the
selected area and at the same time to follow up the change of an image in a sequence of moving pictures. Generally,
distribution at a fixed ratio R0 is adapted to maintain a substantially constant relation of quantization step-size
between pixels in the selected area and pixels outside the selected area, while distribution at the complexity ratio
Rp of a preceding frame is adapted to follow up the change of an image in a sequence of moving pictures. Accordingly,
the present invention is intended to combine the advantages of both methods by making the target-bit-amount
distributing ratio an average of the fixed ratio R0 and the preceding frame complexity ratio Rp. Namely, the
distribution ratio Ra is determined as follows:

Ra = (R0 + Rp)/2
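
The calculation may be sketched as follows. The function names and the numerical values are illustrative only, and the per-frame target is read here as the unused part of the bit budget divided by the number of remaining frames, which is an assumption of this sketch.

```python
def target_bits_per_frame(usable_bits, used_bits, remaining_frames):
    """Bi: unused bit budget shared equally among the remaining frames."""
    return (usable_bits - used_bits) / remaining_frames


def complexity_ratio(gen_bit_f, avg_q_f, gen_bit_b, avg_q_b):
    """Rp: share of the selected area in the complexity of the preceding frame."""
    foreground = gen_bit_f * avg_q_f
    background = gen_bit_b * avg_q_b
    return foreground / (foreground + background)


def split_target(bi, r0, rp):
    """Distribute Bi with Ra = (R0 + Rp) / 2 inside and 1 - Ra outside the area."""
    ra = (r0 + rp) / 2
    return ra * bi, (1 - ra) * bi


if __name__ == "__main__":
    bi = target_bits_per_frame(100000, 40000, 30)
    rp = complexity_ratio(600, 4, 400, 6)
    print(bi, rp, split_target(bi, r0=0.7, rp=rp))
```

Averaging pulls the ratio towards R0 when the preceding frame was unusually simple or complex, so a single atypical frame does not dominate the allocation.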

[0183] In Fig. 42, there are two exemplified curves plotted by dotted lines, which represent the fixed ratio R0 and
the preceding frame complexity ratio Rp in a selected area for a whole video sequence. In this example, a solid-line
curve of Fig. 42 relates to the obtainable ratio Ra for distributing a target coded-data-amount, which does not
depart far from the curve of the fixed ratio and reflects, to a certain extent, a change of an image in a video
sequence. At a fixed ratio (1-R0) and a preceding frame complexity ratio (1-Rp) for the outside of the selected area,
an average ratio, which is the target-bit-amount distributing ratio for pixels outside the selected area, takes a
solid-line-plotted curve shown in Fig. 43.

A total of the two target-bit-amount distributing ratios, for pixels in and outside the selected area, is 1.

[0184] The quantization step-size can be thus adaptively controlled. The bit rate of a whole video sequence, however,
may sometimes exceed a predetermined value because the number of bits used exceeds the target value Bi in some
frames. In this case, the following method may be applied.

[0185] As described above, the target-bit-amount distributing ratio Ra for coding pixels in a selected area is a mean
value of the fixed ratio R0 and the preceding frame complexity ratio Rp, whereas the target-bit-amount distributing
ratio for coding pixels outside the selected area is a minimal value Rm of the fixed ratio (1-R0) and the preceding
frame complexity ratio (1-Rp) for coding pixels outside the selected area. In this case, the target-bit-amount
distributing ratio Rm for coding pixels outside the selected area varies, for example, as shown by a solid line in
Fig. 44. As Ra + Rm ≤ 1, the target number of bits can be reduced for a frame or frames wherein excess bits may
occur. In other words, the bit rate of a whole video sequence may be kept within the predetermined limit by reducing
the target bit-amount of a background area.
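
That the two ratios never add up to more than 1 follows from the minimum of two values not exceeding their mean, and can be checked with a short sketch (the function name is hypothetical):

```python
def limited_ratios(r0, rp):
    """Return (Ra, Rm): the mean for the selected area, the minimum for the outside."""
    ra = (r0 + rp) / 2
    rm = min(1 - r0, 1 - rp)
    return ra, rm


if __name__ == "__main__":
    for r0, rp in ((0.7, 0.5), (0.7, 0.9), (0.6, 0.6)):
        ra, rm = limited_ratios(r0, rp)
        # min(1-R0, 1-Rp) <= ((1-R0) + (1-Rp)) / 2 = 1 - Ra
        print(r0, rp, round(ra + rm, 3))
```

The sum equals 1 only when R0 and Rp coincide; otherwise the background receives fewer bits than 1-Ra would give it.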

[018iq f^i^ and decoding devices according to the present invention, it is possfale to encode a 

jetted 

lP*B7lK ft is possdale to decode o^ coded data is 

• decoded /• - ~ v -K^'; . ' " " • . -.' • : " • " - ' ; '•. • '. . \ ;/ ^ . V ;. • 

20 it is possible to select which the first upper-layer or the second upper-layer 

te d^ is decoded Id be of a Icwertim^ is selected. whereas only a 

selected area is deeded to be of a high image-quality if the second upper^ayer is selected. 

[0189] In decoding all coded data, an image can be decoded in such a way that a selected area of the image may
have a higher image quality than that of all other areas.
25 [01M]^;A^ embocSmerrts of the present invention presumed that thedeoocfing 

device receive 

decorfng terrrinaJ r^ue^ac^ 
data^ 
Nam^;^ 
30 misskpnShe^^ 

some w^ler bandwic^ or tran 

[019t ]^ % «s possfcle to reduce an anxxmt of codted 

data because waited mean information is prepared from btnarized in for m a tion inputting a pturaity ^l^^ifetafr 
sequences onto Abadtgr^ 
35 pre|W«£fromtte^^ 

irrages can be smoothly synthesized without oc^^ any visual delect 

[0192] In weighting not-yet-coded data using weight values to be used for synthesizing video-sequences, the amount
of coded data can be reduced, or the quality of the decoded image can be improved at the same amount of coded data,
as compared with the prior art devices.
[0193] The video-coding device according to the present invention is intended to:

(1) synthesize a not-coded lower-layer frame from preceding and succeeding lower-layer frames by weighted averaging
of the two lower-layer frames existing temporally before and after the synthesizable frame for an overlapped portion
of a first part area with a second part area or an area not belonging to the first part area and the second part
area, by using a lower-layer frame existing temporally after the synthesizable frame for a part of only the first
part area and by using a lower-layer frame existing temporally before the synthesizable frame for a part of only the
second part area, thereby obtaining a synthesized image of a high quality with no distortion even when an object
moves;

(2) synthesize the k>*er4ayer frame (1) using an lower4ayer frames existing tenpora^ r^ 
We frame for an overlapped porti^ 

so part areaard the second part ara cr by using o^ 

thereby obtaining a synthesized image of a high quality with no double vision of ^syntf^ 
even when the background image moves; 

(3) syrrthesize trwHowetf-layer frame (1) by modifying (deforming 

frame, trW first part area arid the second part area by motion compensation of motion param^^ 
55 temporal position of the synthesizabie lower-layer frame, thereby obta ini n g a synthesized image of a Nghquaityto 
follow the movement of a backgrpc^ image of the lw 

(4) synthesize the lower-layer frame (3) by using motion vector information obtained by motion compensative
prediction coding, thereby obtaining a motion parameter with a reduced amount of processing compared with the case of
newly predicting a motion parameter;

(5) adaptively select either approximating the area information by a plurality of geometrical figures or encoding
without approximation, thereby effectively encoding and decoding area information;

(6) convert area information (5) into eight-directional quantized data, determine a difference between the
eight-directional quantized data and encode and decode the difference data by variable-length coding, thereby more
efficiently conducting reversible coding and decoding of area information;

(7) further efficiently encode and decode approximated area information (5) by determining an interframe difference
of geometrical figure information, encoding and decoding by a variable-length coding method, and adding information
indicating no change of area information without encoding other area information when the difference data are all 0;

(8) horizontally scan area information to detect a length of each line therein and determine a horizontal weight
function; vertically scan the area information to detect a length of each line thereof and determine a vertical
weight function; generate many-valued weight information, thereby efficiently generating weight information by a
weight-information generating device when synthesizing an upper-layer part-image with a lower-layer frame by the
weighted averaging method;

(9) encode and decode video frames by using area information indicating a shape of matter or a shape of a part,
determine a mean value of pixels in an area from an input image and the area information corresponding thereto,
calculate a difference between the mean values of a preceding frame and a current frame, compare the difference with
a specified value and select the intraframe coding when the difference exceeds the specified value, thereby making
it possible to correctly change over the coding mode from the predictive (interframe) coding to the intraframe coding
when a scene change occurs and assuring a high quality of coded and decoded images;

(10) separate a video-sequence into background image areas and a plurality of foreground part-images and separately
encode each separated background area and each part-image area by determining whether coded data and codable blocks
exist in or out of a part area, by separately calculating the coded data amount in the part-image area and the coded
data amount in the background image area and by determining target-bit-amount distribution ratios for the part-image
area and the background-image area, thereby assuring correct distribution of the target number of bits to obtain a
high quality of coded images.

Claims

1. A video coding device comprising:

first coding means for coding a video sequence of a background;
second coding means for coding a video sequence of at least a part of a front image; and

area-information coding means for coding a binary area information representing a shape of a part video,
characterized in that the device is further provided with a weight data preparing means for preparing multivalued
weighting data from the binary area-information and gives weight to each of the video sequences according to
the weight data.

2. A video coding device according to claim 1, characterized in that a representative value of the weight data for
each coded block is determined and each of the video sequences is weighted on the basis of the corresponding
representative value of the weight data.

3. A video coding device according to claim 1, characterized in that a representative value of the weight data for
each converted block is determined and each of the video sequences is weighted on the basis of the corresponding
representative value of the weight data.

4. A video decoding device for decoding coded data prepared by the video coding device of claim 1, comprising:

first decoding means for decoding a video sequence of a background;

second decoding means for decoding a video sequence of at least a part of a front image;

area-information decoding means for decoding a binary area information representing a shape of a part video;

weight-data preparing means for preparing multivalued weighting data from the binary area-information;

weighting means for providing each video sequence with a weight reverse to that given by the video-coding
device of claim 1; and

synthesizing means for synthesizing each video sequence weighted by the weighting means.

5. A video decoding device according to claim 4, characterized in that a representative value of the weight data for
each decoded block is determined and each of the video sequences is weighted on the basis of the corresponding
representative value of the weight data.

6. A video decoding device according to claim 4, characterized in that a representative value of the weight data for
each converted block is determined and each of the video sequences is weighted on the basis of the corresponding
representative value of the weight data.

7. A vioW^ing a^ system comprising: 

a vkjra coding device haying first coding means for coding a vfoeq sequence of a background, second coding 
rnea& : ^coc^ of at least a part of a fr<W irrwge;^ coding means for cod- 

ingalrirw 

a vibted deaxJing dev^ first decoding means for decoding a video sequence of a background, second 
4^6dihg means tor decoding a video sequence of at least a ol a fr^ decoding 
means for decoding a area irrfbrmation representing a shape of ;a part video, weight-d^ preparing 

means for preparing multivalued weighting data from the binary area4i ifoi mation, weighting means for provid- 
ing each video sequence with a weigfa arKl syr^^ sequence 

a A video -i^jnfl'-^ for forming a 

background videa^ to combine each 

pah-vkie^s^uence with the background v^^ which 
coding 

9. A video axi^ defvice as def ined in claim 8. characterized in that file weight vahjes for weighted-mean synthesizing 
are prepared from data shewing area-posftion-and-shape information for a plurafity of part-video-sequenoes and 
weights each v^ 

10: A video coding device as defined in daim 9. cha provided wim a cerfng portion (101 , 102, 122. 

1 23) for encoding an imagei bf the video by Hock, a portion (1 10. 121 ; 120) for obt^ 
value for each coding block and a portion (108/109) for weighty each bio* 

11 . A video cot^xtefk&as defined in claim 9. cteracterized m that it is provided with aportion (122, 123, 131 to 134) 
for transforming and coding, a portion (120) for obtaining a representative we^-vaiue by eac* transfon^ 
antiaport^ 

12. A video decoding device for decoding coded data generated by the video coding device defined in claim 8, which

is capable of decoding area-position-and-shape information of a plurality of part-video-sequences, preparing
weight values for synthesizing with weighted mean values from the decoded area-position-and-shape information, and
combining each part-video-sequence with a background video-sequence by using weighted mean values.

13. A video decoding device for decoding coded data generated by the video coding device defined in claim 9, which
is capable of decoding area-position-and-shape information of a plurality of part-video-sequences, preparing
weight values for synthesizing with weighted mean values from the decoded area-position-and-shape information
and weighting each part-video-sequence with weight values being reverse to those applied at the coding device.

14. A video decoding device for decoding the coded data generated by the video coding device defined in claim 10,
which is provided with a decoding portion (105, 106, 124, 125) for decoding an image of the video by block and is
capable of obtaining representative weight values for each decoded block and weighting each decoded block with
representative weight values being reverse to those applied at the coding device.

15. A video decoding device for decoding the coded data generated by the video coding device defined in claim 11,
which is provided with a portion (124, 125, 141 to 144) for transforming and decoding and is capable of obtaining
representative weight values for each transformed block and weighting each transformed block with the representative
weight-values being reverse to those applied at the coding device.
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